Immunogenetic profiles of 9 human herpes virus envelope glycoproteins

Human herpes viruses (HHV) are ubiquitous and have been implicated in numerous long-term health conditions. Since the association between viral exposure and long-term health impacts is partially influenced by variation in human leukocyte antigen (HLA) genes, we evaluated in silico the binding affinities of 9 HHV envelope glycoproteins with 127 common HLA Class I and Class II molecules. The findings show substantial variability in HHV binding affinity across viruses, HLA Class, HLA genes, and HLA alleles. Specific findings were as follows: (1) the predicted binding affinities of HHVs were characterized by four distinct groupings ([HHV1, HHV2], [HHV3, HHV4, HHV5], [HHV6A], [HHV6B, HHV7, HHV8]), with relatively lower binding affinities for HHV1, HHV2, and HHV6A compared to other HHVs; (2) significantly higher binding affinity was found for HLA Class I relative to Class II; (3) analyses within each class demonstrated that alleles of the C gene (for Class I) and DRB1 gene (for Class II) had the highest binding affinities; and (4) for each virus, predicted binding affinity to specific alleles varied, with HHV6A having the lowest affinity for HHV-HLA complexes, and HHV3, HHV4, and HHV5 having the highest. Since HLA-antigen binding is the first step in initiating an immune response to foreign antigens, these relative differences in HHV binding affinities are likely to influence long-term health impacts: cells infected with viruses associated with higher binding affinities across common HLA alleles may be reduced in number to a greater extent, thereby lowering the potential for long-term sequelae of their infections.

The association between viral exposure and long-term health impacts is partially influenced by individual variation in human leukocyte antigen (HLA) genes [7][8][9][10]. Located on chromosome 6, the HLA region is the most highly polymorphic of the human genome 11. Small differences, even single amino acid changes in the binding groove, can alter HLA-antigen binding 12, thereby influencing foreign antigen elimination and disease susceptibility 8,11. HLA genes code for cell-surface glycoproteins that present bound viral epitopes to T cells, signaling immune system responses aimed at virus elimination. Each individual possesses 12 HLA alleles, inherited in a Mendelian fashion, including two of each of the HLA Class I genes (HLA-A, HLA-B, and HLA-C) and two of each of the HLA Class II genes (HLA-DR, HLA-DQ, HLA-DP). Glycoproteins of the two classes operate in concert, albeit via different mechanisms and timeframes. HLA Class I molecules, which are expressed on all nucleated cells, signal destruction of an infected cell by binding and transporting cytosolic virus epitopes to the cell surface for presentation to cytotoxic CD8+ T cells. HLA Class II molecules, which are expressed on lymphocytes and professional antigen presenting cells, bind and present endocytosed exogenous antigen epitopes to CD4+ T cells to stimulate antibody production and long-term adaptive immunity. Developing long-term immunity in the event of virus re-exposure occurs over weeks to months compared to the rapid elimination of infected cells via the HLA Class I system 13.
It is reasonable to hypothesize that rapid elimination of viral antigens by HLA Class I molecules may reduce viral latency and, consequently, reduce long-term sequelae of viral infections. In light of the high degree of HLA polymorphism, however, the immune response to an HLA-virus antigen complex varies, meaning that some HLA-antigen complexes will be more efficient in mounting an immune response to a given virus than others. Indeed, this is exemplified by slowed progression and control of human immunodeficiency virus (HIV) in carriers of certain HLA alleles (for example, HLA-B*27:05 and B*57:01), as compared to rapid disease progression associated with other HLA alleles (e.g., B*35:01) 14. A recent review synthesizing genetic associations of HLA with herpesvirus infection and disease found that most HLA genetic associations are virus- or disease-specific, although certain allotypes were broadly associated with susceptibility or control across herpes viruses 15. We hypothesize that observed differences in HLA associations with HHVs are related to the immune response of HHV-HLA pairs, the first step of which hinges on sufficient HLA-virus antigen binding affinity. Thus, in this study we used an in silico approach to evaluate the binding affinity of 9 HHV proteins with 127 common HLA Class I (n = 69) and Class II (n = 58) antigens.

Predicted binding affinities (PBA) of HHV proteins with HLA class I and II molecules
The PBAs of the 9 HHV proteins (Table 1) with the 127 HLA Class I and II alleles (Tables 2 and 3) are given in Table S1 in Supplementary Material. The effects of HLA Class on these PBAs were evaluated using a repeated measures analysis of variance (ANOVA), where HHV PBA was the "Within-Subjects" factor and HLA Class was the "Between-Subjects" fixed factor. The results revealed a highly significant effect of HHV on PBA (Fig. 1A; P < 0.001, Greenhouse-Geisser test) and four distinct PBA groupings (color-coded in Fig. 1A): [HHV1, HHV2], [HHV3, HHV4, HHV5], [HHV6A], [HHV6B, HHV7, HHV8]. The same groupings were found using multidimensional scaling (MDS), occupying 4 distinct quadrants in the MDS map (Fig. 1B).
With respect to HLA Class, Class I PBA was significantly higher than Class II (Fig. 2A, P < 0.001, F-test). Finally, with respect to HLA genes, we used a repeated measures ANOVA to evaluate the effect of Gene ("Between-Subjects" fixed factor) within each Class. We found a statistically significant effect of Gene for both Class I (P = 0.01, F-test) and Class II (P = 0.038, F-test), with genes C and DRB1 having the highest PBAs (Fig. 2B,C, respectively).

Variation of PBA across HLA alleles
The analyses above evaluated the overall effects of Virus, Class and Gene on PBA. Here we show the individual PBA values for all 127 HLA alleles and the 9 HHV viruses in Figs. 3 and 4. A PBA value of zero corresponds to a lowest percentile rank of 1 (PBA = ln(1) = 0; see Methods), a conservative threshold for high binding affinity. It can be seen that (a) most PBAs are of high affinity (> 0), (b) the number of low affinity PBAs differs across viruses, being highest for HHV6A and lowest for HHV3, HHV4 and HHV5, and (c) in all viruses but HHV6A, low affinity PBAs are confined to Class II. This variation in PBA across alleles and viruses is captured in the heatmap of Fig. 5.

Association of PBA with protein length
Since we tested all possible 9-AA (for Class I) and 15-AA (for Class II) epitopes, it is possible that PBA estimates could depend on protein length, since longer proteins would afford more AA sequences to which HLA molecules could potentially bind. We evaluated this hypothesis by computing, for each allele, the Pearson correlation between PBA values and the number of amino acids in a protein (Table 1). We found the following (Table 4). (a) 18/127 (14.1%) of the correlations were negative (10 Class I, 8 Class II), speaking against the hypothesis above; an example is shown in Fig. 6A. (b) 95/127 (74.8%) correlations were not statistically significant at a nominal threshold of P < 0.05, uncorrected for multiple comparisons; no correlation was significant at a threshold of P < 0.05/127, i.e., P < 0.000394, after the conservative Bonferroni correction, or P < 0.000404, after the less conservative Šidák correction (see Methods). (c) The percent of PBA variance explained by protein AA length (i.e., 100 × r²), irrespective of statistical significance, ranged from 0.001 to 78.5%, but was heavily skewed towards small values (median = 16.7%, mean ± SEM 24.9 ± 2.1%). An example of high positive correlation is shown in Fig. 6B. Overall, these results indicate that there is only a small overall contribution of protein length to PBA values.
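The quantities used in this analysis can be illustrated with a minimal Python sketch. It assumes the standard definitions of the Pearson correlation, the Bonferroni threshold (α/m) and the Šidák threshold (1 − (1 − α)^(1/m)); the function names and the example data are hypothetical and are not the software or values used in the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

def variance_explained_percent(r):
    """Percent of variance explained, i.e. 100 x r^2."""
    return 100.0 * r * r

def bonferroni_threshold(alpha, m):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / m

def sidak_threshold(alpha, m):
    """Per-test significance threshold under the Sidak correction."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)

# Thresholds for m = 127 correlations at alpha = 0.05.
print(f"{bonferroni_threshold(0.05, 127):.6f}")  # 0.000394
print(f"{sidak_threshold(0.05, 127):.6f}")       # 0.000404

# Hypothetical example for one allele: 9 protein lengths and 9 PBA values
# (made-up numbers, for illustration only).
protein_lengths = [394, 393, 623, 907, 742, 1058, 830, 822, 845]
pba_values = [1.2, 0.9, 2.1, 2.4, 2.0, 0.4, 1.8, 1.7, 1.9]
r = pearson_r(protein_lengths, pba_values)
print(r, variance_explained_percent(r))
```

The two printed thresholds match the values quoted in the text; with only N = 9 proteins per allele, each correlation has 7 degrees of freedom, which is one reason few of them can reach the corrected thresholds.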

Discussion
Here we evaluated binding affinities of 127 common HLA Class I and Class II alleles with envelope glycoproteins of 9 HHVs and documented substantial variability in the predicted binding affinities of HHVs with regard to HLA Class, gene, and allele. Since HLA-HHV antigen binding is a critical initial step in mounting an adaptive immune response to a viral infection, these findings highlight relative differences in binding affinities of specific HHVs with common HLA Class I and Class II alleles and point to enhanced ability of certain HLA alleles to facilitate a more effective adaptive immune system response to HHVs that bind with higher affinity to common HLA alleles. In the absence of high affinity HLA-HHV complex binding, the virus may persist 16, establish latency 17, and contribute to subsequent long-term health impacts 18,19.
The HLA-HHV binding affinities here fell into four distinct groups, two of which were characterized by distinctly lower binding affinities. Specifically, HHV1/HHV2 and HHV6A, neurotropic viruses that have been implicated (to varying degrees) in neurological conditions [20][21][22][23], had low binding affinities overall (Fig. 1) and particularly for HLA Class II (Figs. 3, 4, 5). HHV1 and HHV2, commonly referred to as herpes simplex virus-1 (HSV-1) and -2 (HSV-2), cause lifelong infections characterized by periods of latency and reactivation in the form of orolabial (HSV-1) or genital (HSV-2) lesions. The seroprevalence of HSV-1, which is typically acquired by oral contact, in individuals between the ages of 14 and 49 is estimated at 54%, whereas the seroprevalence of HSV-2, which is typically sexually transmitted, is 16% in the same age range 24. Extensive research supports a prominent role of HSV-1 in dementia 20; HSV-2 has been shown to increase HIV acquisition 25 and is associated with neurological complications 26. Part of the roseola family, HHV6A very commonly infects individuals early in life and, as a neurotropic virus, is commonly detected in the brains of healthy individuals as well as those with neurological diseases [27][28][29]. Notably, HHV6A has been shown to integrate into the host germline, allowing for generational transmission of the HHV6A viral genome 30. The relatively low binding affinity of HHV1, HHV2, and HHV6A with common HLA alleles may hinder efficient elimination of those viruses, potentially contributing to the development of subsequent neurological effects. Despite their relatively higher overall binding affinities, several of the HHVs comprising the other two groupings (HHV3, HHV4, HHV5; HHV6B, HHV7, HHV8) have also been implicated in long-term health conditions including various cancers and neurological and autoimmune disorders 19,[31][32][33][34][35]. It is worth noting that even for the 6 viruses associated with higher overall predicted binding affinities to common HLA alleles, there was still substantial variability in predicted binding affinity across alleles (Figs. 3, 4, 5), indicating that some alleles bind with higher affinity than others. Thus, the findings suggest that some HHVs are overall more readily handled by HLA-mediated adaptive immune system mechanisms than others, due to their superior binding affinity with common HLA alleles. Nonetheless, for all of the HHVs, the effectiveness of the adaptive immune system response is predicated on high-affinity HLA-HHV binding. To that end, we hypothesize that HHV latency and reactivation as well as long-term disease associations are HLA-dependent.
It is noteworthy that the predicted binding affinities of HHVs with Class I HLA molecules were significantly higher than those of Class II. HLA Class I and Class II play different, albeit complementary, roles in the adaptive immune response to viruses. HLA Class I promotes rapid elimination of infected cells via cytotoxic CD8+ T cells, whereas Class II contributes to long-term protection via antibody production and immunological memory, a process that can take months 13. We propose that rapid elimination of HHVs by binding with HLA Class I to form high affinity HHV-HLA complexes may reduce the potential of the virus to establish latency, thereby reducing the likelihood of viral reactivation and associated diseases, as has been established in the case of early efficient elimination of HIV by certain Class I alleles 36. This rapid elimination via Class I mediated cytotoxic T cells does not preclude development of Class II mediated antibody production; indeed, much of the population is seropositive for one or more HHVs (1,2,3). In some cases, seropositivity is associated with disease 19,37, suggesting that antibodies reflective of seropositivity do not necessarily confer protection. For both HLA Class I and II, however, adaptive immune protection against HHVs is HLA-dependent in that absence of sufficient HHV-HLA binding affinity inhibits presentation to the CD8+ or CD4+ T cells necessary for signaling destruction of infected cells (Class I) or antibody production (Class II), permitting the viral antigen to persist.
Class I HLA-C and Class II HLA-DRB1 genes were associated with higher binding affinity to HHVs than other genes within each respective HLA class. Compared to other classical HLA Class I genes (A and B), HLA-C is unique in that it is less frequently expressed on the cell surface but is the only HLA Class I gene for which virtually all allotypes serve as a natural ligand for multiple types of killer-cell immunoglobulin-like receptors (KIR), which are expressed on natural killer cells that are known to control infected cells efficiently 38. As reviewed elsewhere 38, mounting research has documented that HLA-C, in combination with KIR, influences control and/or progression of various viral infections including HIV, hepatitis C, and CMV, a member of the Herpesviridae family (HHV5). Alleles of the DRB1 gene have been associated with both protection and susceptibility to various conditions including numerous autoimmune disorders 7,39,40, many of which are associated with virus exposure 41. The current study shines the spotlight more prominently on Class I HLA-C and Class II HLA-DRB1 in influencing the outcome and progression of various HHVs and points to superior binding affinity of HLA-C and HLA-DRB1 as an important underlying mechanism.
The present findings, which document that binding affinity of a given HHV varies across HLA Class I and Class II alleles, must be considered with several qualifications. First, to ensure their survival, HHVs are notorious for utilizing immune evasion mechanisms, several of which involve downregulation of HLA or interference with transport or loading of antigenic peptides, which may impair viral elimination even in the case of a strong antiviral immune response 42. Second, the focus here is on virus antigen-HLA binding since that is a necessary first step in adaptive immunity; the extent of the human immune response is also partially dependent on immunogenicity of the antigen-HLA complex. Thus, it is possible that some of the high affinity virus-HLA associations documented here may not produce a sufficient immunogenic response against viral antigens. Third, the analyses focused on 127 common HLA Class I and Class II alleles; it is possible that other less common alleles that were not investigated here are capable of forming highly immunogenic complexes. Nonetheless, focusing on globally common

HLA alleles
We used 69 common HLA Class I alleles (Table 2) and 58 common HLA Class II alleles (Table 3) that we have employed in previous studies 54.

In silico determination of predicted binding affinity of HLA Class I and Class II molecules
Predicted binding affinities were obtained for viral protein epitopes using the Immune Epitope Database (IEDB) NetMHCpan (ver. 4.1) tool 56,57. More specifically, we used the sliding window approach [58][59][60] to test exhaustively all possible linear 9-mer (for HLA-I predictions) and 15-mer (for HLA-II predictions) AA residue epitopes of the 9 viral proteins analyzed (Table 1). The method is illustrated in Figs. 7 and 8 for the HHV4 virus protein.
For each epitope-HLA molecule pair tested, this tool gives, as an output, the percentile rank of the binding affinity of the HLA molecule and the epitope among predicted binding affinities of the same HLA molecule to a large number of different peptides of the same AA length; the smaller the percentile rank, the better the binding affinity. Now, given a protein of N amino acid length and an epitope length of k AA, there are N − k + 1 binding affinity predictions, i.e., N − k + 1 percentile ranks. Of these predictions, for each viral protein and HLA molecule tested, we retained the lowest percentile rank (LPR) as the best possible binding affinity of the protein-HLA molecule pair. We then applied two transformations on LPR. First, we took its inverse, so that higher values mean better binding affinities for more intuitive interpretation:
$$\mathrm{LPR}' = \frac{1}{\mathrm{LPR}} \qquad (1)$$
The LPR′ distribution was heavily skewed, with most values concentrated at the low end (Fig. 9A), resembling an exponential distribution. Therefore, LPR′ values were (natural) log transformed to normalize their distribution for quantitative analyses (Fig. 9B):
$$\mathrm{PBA} = \ln\left(\mathrm{LPR}'\right) \qquad (2)$$
Given the logarithmic transformation above, PBA > 0 indicates LPR′ > 1, whereas PBA < 0 indicates LPR′ < 1.
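The procedure described above (sliding window, lowest percentile rank, inverse and natural-log transformation) can be sketched in Python as follows. This is only an illustration under stated assumptions: the percentile-rank predictor is passed in as a function, the toy predictor and the example sequence are made up, and none of the names correspond to the IEDB or NetMHCpan interfaces.

```python
import math
from typing import Callable, Iterator

def sliding_windows(sequence: str, k: int) -> Iterator[str]:
    """Yield every linear k-mer of the sequence (N - k + 1 windows for length N)."""
    for start in range(len(sequence) - k + 1):
        yield sequence[start:start + k]

def lowest_percentile_rank(sequence: str, k: int,
                           rank_fn: Callable[[str], float]) -> float:
    """Return the lowest (best) percentile rank over all k-mers of the protein."""
    ranks = [rank_fn(peptide) for peptide in sliding_windows(sequence, k)]
    if not ranks:
        raise ValueError("sequence is shorter than the epitope length k")
    return min(ranks)

def predicted_binding_affinity(lpr: float) -> float:
    """PBA = ln(LPR') with LPR' = 1 / LPR, so higher values mean better binding."""
    if lpr <= 0:
        raise ValueError("percentile rank must be positive")
    return math.log(1.0 / lpr)

def toy_percentile_rank(peptide: str) -> float:
    """Hypothetical stand-in for a real predictor; returns a positive pseudo-rank."""
    return 0.1 + (sum(ord(residue) for residue in peptide) % 50)

# Made-up amino acid sequence, for illustration only.
protein = "MKTLLVAGALLSPVWAQEGTDRHNCFIYKWM"
lpr_class_i = lowest_percentile_rank(protein, 9, toy_percentile_rank)    # 9-mers
lpr_class_ii = lowest_percentile_rank(protein, 15, toy_percentile_rank)  # 15-mers
print(predicted_binding_affinity(lpr_class_i))
print(predicted_binding_affinity(lpr_class_ii))
```

Under this convention a lowest percentile rank of exactly 1 maps to PBA = 0, ranks below 1 give positive PBA, and ranks above 1 give negative PBA.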

Statistical significance uncorrected for multiple comparisons
For this condition, with α = 0.05, P < 0.05 indicated a statistically significant effect for each one of the 127 correlations computed between viral PBA and the number of amino acids in a viral protein.

Fig. 9. Frequency distributions of LPR′ (Eq. 1) (A, heavily skewed) and its log-transformed PBA values (Eq. 2) (B, unimodal).

Fig. 1. (A) Mean predicted binding affinity (± SEM) of the 9 HHV proteins analyzed across the 127 HLA alleles. (B) Plot of the derived HHV protein configuration yielded by the multidimensional scaling analysis. Viral proteins are color-coded to highlight the 4 distinct PBA groups.

Fig. 2. Mean predicted binding affinity (± SEM) of the 9 HHV proteins in the 2 HLA Classes (A), Class I genes (B), and Class II genes (C).
Finally, for each virus, we analyzed binding affinity of a single protein of a single strain, specifically an envelope glycoprotein that is involved in viral entry into the cell. It is unclear to what extent the present findings extend to other proteins and other strains of each of the viruses investigated; although such analyses are beyond the scope of the present paper, they are currently underway.

Table 2. The 69 HLA Class I alleles used.

Table 3. The 58 HLA Class II alleles used.

Table 4. Correlation between HHV PBA and the number of amino acids in the HHV protein (N = 9).

Briefly, we obtained the population frequency in 2019 of 127 common HLA Class I and Class II alleles from 14 Continental Western European Countries (Austria, Belgium, Denmark, Finland, France, Germany, Greece, Italy, Netherlands, Portugal, Norway, Spain, Sweden, and Switzerland). There was a total of 2746 entries of alleles from these countries, comprising 844 distinct alleles. Of those, 69 Class I alleles and 58 Class II alleles occurred in 9 or more countries, with a minimum frequency (in any country) of 0.01. Although those alleles were selected based on their frequency in Europe, they have been found to be common overall across 6 world populations 55, namely African/African American (AFA), Asian/Pacific Islands (API), European/European descent (EURO), Middle East/North Coast of Africa (MENA), South or Central America/Hispanic/Latino (HIS), Native American (NAM), Unknown/not asked/multiple ancestries/other (UKN), and total (TOTAL). All but allele A*36:01 were Common in each one of the 6 populations above; allele A*36:01 was Intermediate in API and EURO populations, Common in the remaining populations, and Common across the 6 populations.