HLA-C*07:01 and HLA-DQB1*02:01 protect against white matter hyperintensities and deterioration of cognitive function: A population-based cohort study

Background: Neuroinflammation and aberrant immune regulation are increasingly implicated in the pathophysiology of white matter hyperintensities (WMH), an imaging marker of cerebrovascular pathologies and predictor of cognitive impairment. The role of human leukocyte antigen (HLA) genes, critical in immunoregulation and associated with susceptibility to neurodegenerative diseases, in WMH pathophysiology remains unexplored.
Methods: We performed association analyses between classical HLA alleles and WMH volume, derived from MRI scans of 38 302 participants in the UK Biobank. To identify independent functional alleles driving these associations, we conducted conditional forward stepwise regression and lasso regression. We further investigated whether these functional alleles showed consistent associations with WMH across subgroups characterized by varying levels of clinical determinants. Additionally, we validated the clinical relevance of the identified alleles by examining their association with cognitive function (n = 147 549) and dementia (n = 460 029) in a larger cohort.
Findings: Four HLA alleles (DQB1*02:01, DRB1*03:01, C*07:01, and B*08:01) showed an association with reduced WMH volume after Bonferroni correction for multiple comparisons. Among these alleles, DQB1*02:01 exhibited the most significant association (β = −0.041, 95 % CI: −0.060 to −0.023, p = 1.04 × 10⁻⁵). Forward selection and lasso regression analyses indicated that DQB1*02:01 and C*07:01 primarily drove this association. The protective effect against WMH conferred by DQB1*02:01 and C*07:01 persisted in clinically relevant subgroups, with a stronger effect observed in older participants. Carrying DQB1*02:01 and C*07:01 was associated with higher cognitive function, but no association with dementia was found.
Interpretation: Our population-based findings support the involvement of immune-associated mechanisms, particularly those involving both HLA class I and class II genes, in the pathogenesis of WMH and its subsequent consequences for cognitive function.


Introduction
White matter hyperintensities (WMH) are characterized by increased brightness on T2-weighted or fluid-attenuated inversion recovery (FLAIR) magnetic resonance imaging (MRI) scans, which indicates altered water content in white matter fibres. These changes in water content are primarily attributed to pathologies such as chronic plasma leakage due to blood-brain barrier (BBB) dysfunction, infection-related oedema, and demyelination (Wardlaw et al., 2015). The clinical significance of WMH has been well established; increased WMH volume has been linked to poorer cognitive outcomes and a heightened risk of dementia (Debette & Markus, 2010), making it a potential endpoint in clinical trials. Additionally, genetics have a strong influence on susceptibility to WMH (Turner et al., 2004).
Most of the pathological changes associated with WMH can be attributed to immune response and neuroinflammation. For example, an immune reaction triggered by infection or vaccination may cross-react with myelin, leading to demyelination (Casserly et al., 2017). Furthermore, chronic inflammation can affect endothelial function, exacerbating atherosclerosis and causing disruption of the BBB, which in turn results in hypoperfusion of white matter (Wardlaw et al., 2019). Pathological studies have suggested the presence of activated microglia, the resident macrophages of the brain, in areas of WMH (Fernando et al., 2004; Gouw et al., 2008). Population studies have demonstrated an association between circulating peripheral proinflammatory markers (e.g., C-reactive protein and interleukin-6) and WMH (Dijk et al., 2005; Nadkarni et al., 2016). An epigenome-wide association study has implicated genes that influence WMH volume, and these genes intersect with pathways related to the immune response (Yang et al., 2023).
The human leukocyte antigen (HLA) genes, which encode cell surface molecules responsible for antigen presentation, are essential for human immune surveillance and pathogen elimination. As the HLA class I and II loci are the most polymorphic genes in the human genome, they encode a highly diverse repertoire of HLA molecules in the population. This diversity results in differences in people's ability to mount an immune response and eliminate pathogens, which in turn causes variable levels of inflammation and cell damage after infection. Recent research indicates that HLA molecules may also significantly impact brain health and disease, as certain HLA class I and II alleles have been found to be associated with dementia (Bellenguez et al., 2022; Lindbohm et al., 2022), although the underlying mechanism is not yet clear. One plausible explanation is that HLA-antigen mismatch or an overactive immune response might trigger an inflammation cascade, causing cumulative damage to white matter, including vascular dysfunction and demyelination. This damage would ultimately manifest radiologically as WMH and impair the connectivity on which human cognitive function depends. However, due to the limited sample sizes in most imaging studies and the lack of high-resolution HLA genotyping data, few studies have investigated the direct association between specific HLA alleles and WMH.
This study had three main objectives. Firstly, we aimed to analyse the association between 177 individual HLA alleles and the volume of WMH. Secondly, among those alleles associated with WMH, we investigated whether the associations were independently driven by one or multiple alleles, and examined potential effect modification by age, sex, APOE ε4 status, lifestyle, and disease history. Finally, we examined the impact of the functional HLA alleles on clinically relevant WMH-related outcomes, including cognitive function and the risk of developing dementia.

Study design and participants
The UK Biobank longitudinal cohort enrolled more than 500 000 participants aged 40-69 years between 2006 and 2010 (Sudlow et al., 2015). Baseline assessments were conducted at 22 research centres across England, Scotland, and Wales. These assessments included participants completing a touchscreen questionnaire covering extensive sociodemographic, lifestyle, and health-related information, a verbal interview with a trained nurse regarding past and current medical conditions, and the provision of blood samples for genotyping.
Additionally, a set of unsupervised cognitive tests was administered as part of the baseline touchscreen questionnaire. Incident health outcomes were identified through periodic linkage to electronic health records.
Starting in 2014, the UK Biobank imaging study aimed to re-invite 100 000 participants from the baseline assessment for multi-modal imaging assessments (Littlejohns et al., 2020). During the imaging visit, participants underwent MRI scans at four imaging centres (Bristol, Cheadle, Newcastle, and Reading) using standardized protocols for brain, heart, and body imaging on a 3 Tesla Siemens Skyra scanner (software VD13) and a standard Siemens 32-channel head coil. For this study, we utilized the latest imaging data (released in September 2022), which included first-scan brain MRI data from over 40 000 participants. The touchscreen questionnaires and nurse-led verbal interviews for medical conditions administered at baseline were also repeated to collect updated data.
The participant selection process is illustrated in Supplementary Fig. 1. The study included 488 168 participants with HLA genotyping data. The study was restricted to individuals of white European descent because HLA gene imputation is more accurate in this population (UK Biobank, 2016) and because the UK Biobank's HLA imputation utility has been validated only in white individuals (Bycroft et al., 2018). In the exploratory study, participants with prior diagnoses of stroke, multiple sclerosis, parkinsonism, dementia, or other neurodegenerative/demyelinating conditions were excluded, resulting in a final sample of 38 302 participants. In the validation study, all participants with available outcome data were included, whereas those with dementia diagnoses prior to the assessment were excluded from the cognitive function analysis. This resulted in sample sizes of 147 549 and 460 029 participants for baseline cognitive function and dementia analysis, respectively.
Ethics approval for the UK Biobank study was obtained from the North West Multi-centre Research Ethics Committee, and written informed consent was obtained from all participants. This research has been conducted under Application Number 69741.

HLA imputation
We utilized the July 2017 release of imputed genetic data from approximately 490 000 individuals in UK Biobank. Single nucleotide polymorphism (SNP) chip genotyping was conducted using two similar arrays from Affymetrix, namely UK BiLEVE and UKB Axiom, which covered approximately 50 000 and 450 000 individuals, respectively. Quality control and imputation for the SNP data were conducted by UK Biobank (Bycroft et al., 2018).
Imputation at a 4-digit resolution was performed for the 11 classical HLA genes (HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DRB3, HLA-DRB4, HLA-DRB5, HLA-DQA1, HLA-DQB1, HLA-DPA1, and HLA-DPB1) using HLA*IMP:02 and a genetically diverse reference panel. To assess the accuracy of imputation, a cross-validation experiment was conducted in the reference panel samples. For samples of European ancestry, with a posterior probability call threshold of 0.7, the imputation accuracy exceeded 96 % across all loci (UK Biobank, 2016).
To validate the utility of UKB HLA imputation, association tests were conducted for 11 self-reported immune-mediated diseases that have known HLA associations. The identified HLA alleles exhibited consistency with previous studies in terms of direction and effect sizes (Bycroft et al., 2018).

White matter hyperintensities
UK Biobank conducted the processing and quality control of raw brain MRI data (Alfaro-Almagro et al., 2018). WMH volumes were generated by the UK Biobank using the Brain Intensity AbNormality Classification Algorithm (BIANCA), an automated, supervised method for detecting WMH (Griffanti et al., 2016). BIANCA utilizes the k-nearest neighbour (k-NN) algorithm and information from different MRI modalities (T1-weighted and T2 fluid attenuation inversion recovery (T2-FLAIR)) as inputs to output the probability of voxel classification as WMH. The WMH volume calculated based on BIANCA exhibited a good correlation with visual ratings and age (Griffanti et al., 2016). WMH volumes were log-transformed due to their skewed distribution.

Cognitive and dementia outcomes
Four cognitive tests were administered to participants at baseline (Fawns-Ritchie & Deary, 2020), including: (1) Reaction time, a symbol-matching task that required participants to quickly press a button if the displayed cards were identical. The score was calculated as the mean duration to the first button press for matching pairs. (2) Visual memory, in which participants were shown a 3 × 4 matrix of cards containing 6 pairs of cards with different images. The cards were initially displayed face-up and then turned face down. Participants were required to match all the pairs by touching the back of the cards. The score was determined by the number of errors made during the matching process. (3) Fluid intelligence, comprising 13 questions assessing verbal and numerical reasoning ability, with scores ranging from 0 to 13 based on correct answers. (4) Prospective memory, involving the performance of a specific behaviour later in the assessment session based on a single instruction given at the beginning. Participants were scored as 1 if they completed the task on the first attempt and 0 otherwise. These four tests demonstrated moderate-to-high validity and test-retest reliability (Fawns-Ritchie & Deary, 2020), as well as predictive ability for incident all-cause dementia during a three- to eight-year prospective follow-up (Calvin et al., 2019). Raw scores for all tests, except for prospective memory which is a binary variable, were standardized within five-year age strata. Higher z-scores indicate better performance.
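The standardization within five-year age strata can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the population standard deviation is used, and the sign is flipped for tests where a lower raw score is better (reaction time, visual memory errors) so that higher z-scores indicate better performance.

```python
from collections import defaultdict
from statistics import mean, pstdev


def age_stratum(age, width=5):
    """Lower bound of the five-year age band, e.g. 43 -> 40."""
    return int(age // width) * width


def standardise_within_strata(scores, ages, higher_is_better=True):
    """Z-score each raw score against participants in the same age band."""
    by_stratum = defaultdict(list)
    for score, age in zip(scores, ages):
        by_stratum[age_stratum(age)].append(score)

    # Mean and population SD per band (assumes each band has non-zero spread).
    stats = {band: (mean(vals), pstdev(vals)) for band, vals in by_stratum.items()}

    # Assumption: flip the sign when a lower raw score means better performance.
    sign = 1.0 if higher_is_better else -1.0
    z_scores = []
    for score, age in zip(scores, ages):
        mu, sd = stats[age_stratum(age)]
        z_scores.append(sign * (score - mu) / sd)
    return z_scores


# Toy reaction times (ms): faster participants get positive z-scores.
reaction_times_ms = [500.0, 600.0, 700.0, 800.0]
ages = [41, 43, 66, 68]
z = standardise_within_strata(reaction_times_ms, ages, higher_is_better=False)
```

In this toy example each participant is compared only with the other person in the same age band, so a 700 ms response at age 66 scores as well as a 500 ms response at age 41.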
Dementia cases were identified through baseline self-reported history, hospital inpatient data (primary or secondary diagnosis of dementia), or records of death listing dementia as the cause.

Covariates and effect modifiers
Age and sex information were obtained from the central registry. The first 10 principal components (PCs) of genetic ancestry were derived from baseline genotyping data. Imaging-related covariates included head size, head motion, head and scanner table position, and imaging centre. Following the UK Biobank guidelines (Alfaro-Almagro et al., 2021), head size, motion, and position were demedianed and normalized using the median absolute deviation across all participants. Outliers were defined as values exceeding 8. Any outliers or missing values within each site were replaced with the median value specific to that site, and then normalized to obtain a zero mean and one standard deviation.
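One possible reading of this confound-cleaning procedure is sketched below. The function name is hypothetical, and two details are assumptions rather than statements from the text: the cutoff of 8 is applied to the absolute value of the scaled variable, and the raw median absolute deviation is used without any consistency constant.

```python
from statistics import mean, median, pstdev

OUTLIER_CUTOFF = 8.0  # from the text; applying it to |scaled value| is an assumption


def normalise_confound(values, sites):
    """Robustly scale one imaging confound, then fix outliers/missing values per site."""
    observed = [v for v in values if v is not None]
    centre = median(observed)
    mad = median([abs(v - centre) for v in observed])

    # Step 1: subtract the median and divide by the median absolute deviation.
    scaled = [None if v is None else (v - centre) / mad for v in values]

    # Step 2: flag missing values and outliers.
    valid = [s is not None and abs(s) <= OUTLIER_CUTOFF for s in scaled]

    # Step 3: replace flagged entries with the median of valid values at the same site.
    site_median = {
        site: median([s for s, ok, st in zip(scaled, valid, sites) if ok and st == site])
        for site in set(sites)
    }
    filled = [s if ok else site_median[st] for s, ok, st in zip(scaled, valid, sites)]

    # Step 4: rescale to zero mean and unit standard deviation.
    mu, sd = mean(filled), pstdev(filled)
    return [(f - mu) / sd for f in filled]


# Toy data: one extreme value and one missing value, two imaging centres.
head_size = [1.0, 2.0, 3.0, 4.0, 5.0, 1000.0, None, 2.5]
centre_ids = ["A", "A", "A", "B", "B", "A", "B", "B"]
normalised = normalise_confound(head_size, centre_ids)
```

Here the extreme value at centre A and the missing value at centre B both end up at their own centre's median before the final rescaling.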
APOE genotypes were determined using two SNPs, rs7412 and rs429358. Participants carrying at least one copy of APOE ε4 were classified as APOE ε4 carriers.
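The mapping from the two SNPs to APOE ε4 carrier status can be sketched as below. This is an illustration, not the authors' pipeline: the function names are hypothetical, genotypes are assumed to be unphased two-letter strings on the forward strand (rs429358 T>C, rs7412 C>T), and the ambiguous double-heterozygote is called ε2/ε4 by convention, ignoring the very rare ε1 haplotype.

```python
def apoe_genotype(rs429358, rs7412):
    """Map unphased genotypes at the two SNPs (e.g. 'CT', 'CC') to an APOE genotype."""
    n_c_429358 = rs429358.upper().count("C")  # C at rs429358 marks the e4 haplotype
    n_t_7412 = rs7412.upper().count("T")      # T at rs7412 marks the e2 haplotype
    table = {
        (0, 0): "e3/e3",
        (0, 1): "e2/e3",
        (0, 2): "e2/e2",
        (1, 0): "e3/e4",
        (1, 1): "e2/e4",  # conventional call; the rare e1/e3 alternative is ignored
        (2, 0): "e4/e4",
    }
    # Combinations that would require an e1 haplotype are returned as None.
    return table.get((n_c_429358, n_t_7412))


def is_e4_carrier(rs429358, rs7412):
    """True if the participant carries at least one copy of APOE e4."""
    genotype = apoe_genotype(rs429358, rs7412)
    return genotype is not None and "e4" in genotype


example_genotype = apoe_genotype("CT", "CC")
example_carrier = is_e4_carrier("CT", "CC")
```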
Lifestyle was comprehensively evaluated by assessing nine well-established health-related factors during the imaging study. These factors included smoking status, alcohol intake, physical activity, television viewing time, sleep duration, and consumption of fruit, vegetables, oily fish, red meat, and processed meat. Each factor was categorized into healthy and unhealthy categories based on a published study (Foster et al., 2018). Participants in the healthy category were assigned 1 point, while those in the unhealthy categories received 0 points. The sum score of the nine factors was then grouped into two categories: unhealthy (scored 0-6) and healthy (scored 7-9).
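The scoring rule reduces to a sum of nine binary indicators followed by a single cut point. A minimal sketch, with hypothetical function names and the per-factor healthy/unhealthy classification (Foster et al., 2018) assumed to have been done already:

```python
N_FACTORS = 9          # number of lifestyle factors stated in the text
HEALTHY_THRESHOLD = 7  # scores 7-9 are "healthy", 0-6 "unhealthy" (from the text)


def lifestyle_score(healthy_flags):
    """Sum of nine 0/1 indicators, 1 meaning the healthy category for that factor."""
    if len(healthy_flags) != N_FACTORS:
        raise ValueError(f"expected {N_FACTORS} factors, got {len(healthy_flags)}")
    return sum(1 for flag in healthy_flags if flag)


def lifestyle_category(healthy_flags):
    """Collapse the 0-9 sum score into the two groups used for stratification."""
    if lifestyle_score(healthy_flags) >= HEALTHY_THRESHOLD:
        return "healthy"
    return "unhealthy"


# Toy participant: healthy on seven of the nine factors.
flags = [1, 1, 1, 0, 1, 1, 0, 1, 1]
score = lifestyle_score(flags)
category = lifestyle_category(flags)
```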
History of hospital-treated infections was defined as having primary or secondary diagnoses of infectious diseases from hospital inpatient records prior to the imaging study. Histories of other diseases were obtained from hospital inpatient records and through self-reported medical history collected during baseline and imaging study verbal interviews. Supplementary Tables 1-2 provide additional details on the definition of covariates and codes for the variables involved.

Statistical analyses
We conducted a stepwise assessment of HLA gene contributions to WMH. First, in the exploration study, we examined the association between all polymorphic HLA alleles with carrier frequency above 0.1 % in our sample and log-transformed WMH volume using linear regression. We adjusted for age at the imaging study, sex, genotyping array (UK BiLEVE or UKB Axiom), first 10 PCs of genetic ancestry, and imaging-related covariates. To account for multiple comparisons, a Bonferroni correction was applied by multiplying the original p-values by the number of tests. In the main analysis, we evaluated the dominant effect of HLA genes by coding individuals as either possessing or not possessing at least one copy of the allele (a binary exposure), based on a posterior probability cutoff of 0.7. In the sensitivity analysis, we evaluated the additive effect of HLA genes by categorizing individuals according to the number of effect alleles at each locus. Individuals with a posterior probability less than 0.7 were classified as having 0 alleles, those between 0.7 and 1.7 as having one allele, and those greater than or equal to 1.7 as having two alleles.
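The allele coding and the Bonferroni adjustment described above amount to a few threshold rules. The sketch below uses hypothetical function names and toy p-values; the thresholds (0.7 and 1.7) and the multiply-by-number-of-tests rule come from the text, while capping adjusted p-values at 1 is an assumption.

```python
def dominant_coding(imputed_value, cutoff=0.7):
    """Binary exposure: 1 if at least one copy of the allele is called, else 0."""
    return 1 if imputed_value >= cutoff else 0


def additive_coding(imputed_value):
    """Number of effect alleles (0, 1 or 2) from the imputed value."""
    if imputed_value < 0.7:
        return 0
    if imputed_value < 1.7:
        return 1
    return 2


def bonferroni_adjust(p_values, n_tests=None):
    """Multiply each p-value by the number of tests; capping at 1 is an assumption."""
    n = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * n) for p in p_values]


# Toy imputed values for one allele in five participants.
imputed = [0.02, 0.69, 0.70, 1.20, 1.95]
dominant = [dominant_coding(v) for v in imputed]
additive = [additive_coding(v) for v in imputed]

# Toy p-values for three hypothetical tests.
adjusted = bonferroni_adjust([0.01, 0.5, 0.0001])
```

With these toy inputs, an allele is retained at the Bonferroni-adjusted threshold of 0.05 only if its raw p-value is below 0.05 divided by the number of tests.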
Next, for HLA alleles that demonstrated a Bonferroni-adjusted p-value < 0.05 for their dominant effect, we conducted a conditional forward stepwise regression to identify potential functional alleles driving the associations, using a significance threshold of p-value < 0.05. Additionally, to account for potential high correlations among the HLA alleles, we performed lasso regression as a sensitivity analysis to validate the screening results of the functional alleles.
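To illustrate why lasso regression can set the coefficients of non-driving alleles exactly to zero, the sketch below implements a small coordinate-descent lasso with a soft-thresholding update. It is a hand-written toy, not the glmnet-based analysis used in the study: the function names and data are hypothetical, the penalty is fixed rather than chosen by cross-validation, and predictors are assumed to be centred with no intercept or covariates.

```python
def soft_threshold(value, penalty):
    """Shrink a value towards zero by `penalty`; values inside the band become exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(columns, y, penalty, n_iter=100):
    """Minimise (1/2n)*||y - Xb||^2 + penalty*||b||_1 by cyclic coordinate descent.

    `columns` is a list of predictor columns (each a list of length n), assumed centred.
    """
    n = len(y)
    coefs = [0.0] * len(columns)
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Partial residual: remove the fit of every predictor except j.
            partial = [
                y[i] - sum(coefs[k] * columns[k][i] for k in range(len(columns)) if k != j)
                for i in range(n)
            ]
            rho = sum(col[i] * partial[i] for i in range(n)) / n
            scale = sum(v * v for v in col) / n
            coefs[j] = soft_threshold(rho, penalty) / scale
    return coefs


# Toy data: the outcome depends strongly on the first predictor, weakly on the second.
allele_a = [1.0, -1.0, 1.0, -1.0]
allele_b = [1.0, 1.0, -1.0, -1.0]
outcome = [2.1, -1.9, 1.9, -2.1]  # equals 2.0 * allele_a + 0.1 * allele_b
coefs = lasso_coordinate_descent([allele_a, allele_b], outcome, penalty=0.5)
```

In this toy design the weak predictor's covariance with the residual (0.1) falls below the penalty (0.5), so its coefficient is set to exactly zero while the strong predictor is retained with a shrunken coefficient.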
We assessed the associations between the functional alleles and WMH volume, stratified by potential effect modifiers such as lifestyle, history of hypertension, hospital-treated infections, and autoimmune diseases. Interaction terms between the functional allele and each effect modifier were included in the models.
Furthermore, we validated the effect of the functional HLA alleles by examining their associations with cognitive function assessed at baseline and with dementia. Linear regression models (or logistic regression for the binary prospective memory outcome) were used for the cognitive function analysis, adjusting for age at assessment, sex, genotyping array, and first 10 PCs of genetic ancestry. Logistic regression models were used for the dementia outcome, adjusting for birth year, sex, genotyping array, and first 10 PCs of genetic ancestry.

Results
For the exploratory study, a total of 38 302 participants were included, with a mean age of 64.5 years at the time of the imaging study. Four HLA alleles (DQB1*02:01, DRB1*03:01, C*07:01, and B*08:01) were associated with reduced WMH volume after Bonferroni correction, all exhibiting similar protective effects (Fig. 1). Among these four alleles, the most significant association was observed for the DQB1*02:01 allele, with individuals carrying this allele having a lower log-transformed WMH volume at p = 1.04 × 10⁻⁵ (β = −0.041, 95 % CI: −0.060 to −0.023). Additionally, C*07:01 and B*08:01 showed an additive effect on WMH, with an increased number of alleles leading to further decreases in WMH volume.
In the forward selection model, analysing all four HLA alleles jointly, DRB1*03:01 and B*08:01 alleles were excluded, while DQB1*02:01 and C*07:01 alleles remained as independent alleles associated with WMH (Table 2). The lasso regression also confirmed the retention of these two alleles, with the effect estimates of other alleles shrinking to zero. Hence, our further stratification and validation analyses focused solely on DQB1*02:01 and C*07:01.
The two protective alleles demonstrated consistent magnitudes of association with WMH volume within clinically relevant subgroups and exhibited similar trends across these subgroups (Table 3). We observed a significant interaction between DQB1*02:01 and age on the WMH volume (p for interaction = 0.022). Specifically, a stronger association was noted in participants above 65 years at the time of imaging study (β = −0.06, 95 % CI: −0.09 to −0.03, p < 0.001), compared with participants younger than 65 (β = −0.01, 95 % CI: −0.04 to 0.01, p = 0.283). DQB1*02:01 also had a stronger association with WMH volume in those with unhealthy lifestyles compared to healthy ones (β = −0.08, 95 % CI: −0.11 to −0.04, p < 0.001 vs. β = −0.03, 95 % CI: −0.05 to −0.01, p = 0.011; p for interaction = 0.023). The significant association of DQB1*02:01 and C*07:01 with WMH remained consistent across sex and risk factors for WMH, including a history of hypertension and the APOE ε4 genotype. While we observed a significant association only in participants without a history of hospital-treated infection and autoimmune disease, the interaction between the protective alleles and these factors was not significant.
The validation study conducted on a larger sample with baseline cognitive function data revealed significant associations of both DQB1*02:01 and C*07:01 alleles with higher fluid intelligence and better visual memory (Fig. 2). Additionally, carrying the C*07:01 allele was associated with higher performance in prospective memory. There was no evidence of an association between the DQB1*02:01 or C*07:01 alleles and the risk of developing dementia.

Discussion
In this study, we comprehensively analysed the association between classical HLA genes and WMH in a large sample of up to 38 302 individuals. We identified four HLA alleles that showed significant associations with WMH. Notably, among these alleles, DQB1*02:01 and C*07:01 emerged as independent functional protective factors against WMH. The association between these two alleles and WMH remained significant across various subgroups, including those stratified by established risk factors for WMH, such as hypertension status and APOE ε4 status. To further validate the clinical relevance of DQB1*02:01 and C*07:01, we analysed a larger sample of 147 549 individuals, confirming their consistent association with higher cognitive function measured at baseline.
Emerging evidence suggests an immune-related pathophysiology of WMH, as indicated by associations with markers of inflammation and immune function such as C-reactive protein (Dijk et al., 2005), leukocyte count (Kim et al., 2011), and immune-inflammation index (derived from neutrophil, lymphocyte, and platelet counts) (Nam et al., 2022). Furthermore, white matter lesions have been detected in individuals who have experienced septic shock (Polito et al., 2013). However, the extent to which immune dysfunction originates from genetic factors or is triggered by severe infection and subsequent release of proinflammatory cytokines, or both, remains largely unexplored. Our findings support the involvement of immune-related genetic factors in the development of WMH by identifying multiple protective variants within the HLA region, known for its critical involvement in immune system regulation. In particular, HLA class I (C*07:01) and class II (DQB1*02:01) alleles demonstrated independent and similar protective effects on WMH volume, indicating the involvement of both natural killer (NK) and CD4+ cell-mediated immune processes in the pathological mechanism. HLA genes are commonly co-inherited with each other, a phenomenon known as linkage disequilibrium (LD). Among the four WMH-associated alleles identified, LD is well established between HLA-DQB1 and DRB1 loci. Although DQB1*02:01 was statistically mapped as the lead variant in our study, further functional experiments are warranted to confirm its biological causality.
Among prior studies, one conducted a hypothesis-free SNP-based GWAS analysis on WMH and 96 markers of white matter integrity (Persyn et al., 2020). However, this study did not identify any genome-wide significant loci associated with WMH within the HLA region. Another study examined the association between HLA alleles and 1348 brain imaging phenotypes derived from various imaging modalities in the UK Biobank (Bian et al., 2022), and detected only two (C*07:01 and DQA1*05:01) protective alleles against WMH. Our study extends this finding, confirming the association with C*07:01 and identifying three additional alleles that lie on the same AH8.1 haplotype as DQA1*05:01. It is important to note that these studies did not prioritize HLA genes or specifically focus on WMH; instead, they tested thousands of associations and employed Bonferroni correction to control the genome-wide false-positive rate at 5 %, which limited their power to detect all WMH-related differences explained by HLA alleles (Tam et al., 2019). Furthermore, these studies utilized earlier releases of UK Biobank imaging data, resulting in smaller sample sizes. In contrast, our analysis focused solely on HLA alleles and utilized the most recent and largest available dataset, thereby enhancing the likelihood of identifying true positive associations.
Note to Table 2: SE, standard error. Forward selection was performed using the step function in R (https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/step), employing the Akaike information criterion (AIC) for predictor addition or removal. The process concluded when no further reduction in AIC could be attained. Covariates were retained in the model, while stepwise selection was applied to HLA alleles. Lasso regression was conducted using the R package "glmnet" (https://CRAN.R-project.org/package=glmnet). The regularization parameter λ was determined by selecting the value that yielded the lowest mean cross-validated error.
The association between the protective alleles and WMH in our study was specifically significant among participants aged ≥ 65 years, highlighting a potential role of these HLA alleles in mitigating the effects of aging-related immune dysfunction, known as immunosenescence. Furthermore, this association remained consistent across various established risk factors for WMH, including hypertension (Wartolowska & Webb, 2021) and APOE ε4 status (Lyall et al., 2020). These findings suggest that HLA genetic factors may have an additional predictive value for WMH beyond these known factors and may inform more targeted pathological and treatment investigations. Notably, a stronger association was observed in participants without a history of hospital-treated infection and autoimmune disease. Because the interaction effect was not significant, this difference may be attributable to differences in subgroup sample sizes. Future research should explore the complex interplay between HLA genetics and immune-related diseases in greater detail.
Haplotype AH8.1, known as the "autoimmune haplotype," is marked by a persistent pro-inflammatory state with elevated autoantibodies, circulating immune complexes, and TNF-α levels even in healthy carriers (Price et al., 1999). Heightened immune responses in AH8.1 carriers may predispose them to various autoimmune diseases (Gambino et al., 2018), yet they also enhance pathogen recognition and clearance, potentially contributing to AH8.1's positive selection and high frequency in Caucasians (Crespi & Go, 2015). Studies indicate that AH8.1 is linked to a delayed onset of bacterial lung infections in cystic fibrosis patients (Laki et al., 2006), and a reduced occurrence of septic shock (Aladzsity et al., 2011). Notably, sepsis-induced brain lesions predominantly affect white matter (Sharshar et al., 2007), with WMH serving as a core neuroimaging feature of septic shock (Orhun et al., 2019). The pathway from sepsis to cerebral vascular injuries, as indicated by WMH, and subsequent cognitive decline has been well-established (Annane & Sharshar, 2015). Furthermore, there is evidence linking hospital-treated infection to larger WMH volume and cognitive impairment (Beydoun et al., 2023; Gracner et al., 2021).
Our consistent associations across participants independent of infections and autoimmune diseases might also be facilitated through the complement component 4 (C4), given the negative correlation between AH8.1-derived alleles such as DQB1*02 and the C4A allele (Price et al., 1999; Sekar et al., 2016). This is particularly significant considering the effect of the C4A locus; it can cause excessive complement deposition and heightened microglial activity (Sellgren et al., 2019). These in turn lead to over-pruning of synapses (Sellgren et al., 2019) and myelin damage, which manifests clinically as WMH (Lee et al., 2019). Moreover, overexpression of C4A, but not C4B, is associated with an increased susceptibility to schizophrenia, a condition characterized by cognitive dysfunction (Sekar et al., 2016).
Our study utilized a large cohort with validated HLA genetic data and high-quality brain imaging data, providing robust statistical power to detect the contributions of multiple HLA alleles to WMH. Moreover, comprehensive evaluation of lifestyle and medical conditions during the imaging study enabled us to explore potential effect modifications by various factors. Additionally, by utilizing cognitive tests and electronic health records available for the entire cohort, we investigated the potential clinical relevance of the HLA alleles across the spectrum from cognitive function to dementia outcomes.
Several limitations need to be considered. First, while the imputation accuracy of HLA alleles for individuals of European ancestry surpasses 96 %, employing direct sequencing in future studies would enhance fine-mapping accuracy. Second, due to the predominance of white European participants and the focus on optimizing HLA imputation accuracy for Europeans in the UK Biobank (UK Biobank, 2016), our study exclusively included individuals of European ancestry. Further investigations are required to extend the generalizability of this association to individuals from diverse ethnic backgrounds. Lastly, our findings may be susceptible to collider bias due to the restriction of analyses to a sub-cohort of individuals who participated in the imaging study, potentially resulting in an underestimation of the protective effect of the HLA alleles on WMH, as these individuals may be healthier compared to the overall UK Biobank cohort (Lyall et al., 2022).

Conclusions
In summary, our study provides evidence of the protective effect of HLA alleles DQB1*02:01 and C*07:01 against WMH, which is supported by their associations with higher cognitive function. These findings support the involvement of immune-related mechanisms and an HLA genetic basis in the pathogenesis of WMH.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1. Significant HLA alleles associated with white matter hyperintensities after Bonferroni correction. (a) The p-value for trend in the additive model was determined by treating the number of HLA alleles as an ordinal variable and employing the lm function in R to conduct the test for linear trend.

Fig. 2. Association of the WMH-related HLA alleles with cognitive outcomes. * p < 0.05, † p < 0.001. (a) A higher odds ratio for prospective memory indicates a greater likelihood of successfully completing the task.

Table 2
Potential functional HLA alleles identified by forward selection and lasso regression.

Table 3
Association between HLA and WMH in clinically relevant subgroups.