West Nile virus–associated HLA‐DRB1 alleles in the Greek population: A structural perspective

The HLA system plays a significant role in the regulation of the immune system and contributes to the progression of, and protection against, many diseases. In our previous study, several HLA‐DRB1 alleles were found to be associated with susceptibility to, or protection against, infection and neuroinvasion by West Nile virus (WNV) in the Greek population. As expected, the majority of polymorphic positions are located in the peptide‐binding region of the molecule. In the present work, the structure of these alleles was studied in silico to examine the effect of polymorphism on the conformation of DRB1 proteins with respect to WNV association. More specifically, molecular dynamics simulations were used for structural prediction of 23 available alleles. The modeled alleles were evaluated using root‐mean‐square deviation (RMSD) and root‐mean‐square fluctuation (RMSF) analysis. Low RMSD values indicate that different alleles have similar structures. Furthermore, low fluctuation was observed in the peptide‐binding region when comparing the alleles with the highest and the lowest RMSD values. These findings suggest that the variable residues probably do not affect the behavior of DRB1 alleles in WNV disease by causing structural differences between them.


INTRODUCTION
The development of a specific immune response to a pathogen involves several factors and pathways. MHC class II molecules are constitutively expressed by professional antigen-presenting cells and play a major role in antigen processing and presentation. 1,2 The MHC class II region in humans (HLA) contains three distinct gene loci, HLA-DP, -DQ, and -DR. 3 HLA class II molecules consist of two chains, α and β. The DPA, DQA, and DRA genes code for the α chain, while the DPB, DQB, and DRB genes code for the β chain. Except for DRA, which is dimorphic with just one substitution at the intracytoplasmic position 227, the MHC class II locus is highly polymorphic. To date, 233 DPA1 alleles, 279 DQA1 alleles, 1674 DPB1 alleles, 1968 DQB1 alleles, and 2909 DRB1 alleles have been identified. 4-6 The high polymorphism observed in this locus is mainly located in exon 2 of these genes, which codes for the peptide-binding region (PBR). The PBR resides in a peptide antigen-binding groove, formed between the α1 and β1 helices with a β-pleated sheet forming its floor. In this groove, HLA molecules feature binding pockets into which anchor residues of the antigenic peptide can fit. The fit is stabilized by hydrogen bonds and salt bridges between the side chains of the α and β chains and along the backbone of the peptide. The pockets contributing most to peptide binding and selectivity are pockets P1, P4, P6, P7, and P9 7-9 (the pockets are numbered along the peptide relative to a large, usually hydrophobic pocket near the peptide-binding site). Pocket P1 has an important role in anchoring the peptide in the groove, pockets P6 and P9 have a role in peptide specificity, and pockets P4 and P7 have a dual role that includes antigen specificity and modulation of pocket-peptide interactions. 10 It has been shown in DRB1 molecules that P10 can also influence the peptide-binding selectivity. 9 The amino acid sequence of these pockets differs between HLA alleles, which has a direct impact on the binding specificity and affinity of the HLA molecule. 11
In our previous study, we examined the polymorphism of HLA-DRB1 (among other HLA class II genes) in a cohort of West Nile virus (WNV) cases and a control group, both of Greek origin, to investigate its contribution to the infection and outcome of WNV disease. Alleles that indicate protection against or susceptibility to WNV infection, as well as alleles that may protect from neuroinvasion, were identified. More specifically, 24 DRB1 alleles were identified in the groups studied. DRB1 alleles *16:02, *08, and *11:15 were present only in patients with severe WNV disease and were strongly associated with neuroinvasion. By contrast, alleles DRB1*08:04 and DRB1*14:04 were associated with susceptibility to WNV, while alleles DRB1*11:18, DRB1*11:02, DRB1*11:03, and DRB1*08:30 were identified only in individuals with mild clinical symptoms and were thus associated with a protective role against neuroinvasion. 12 As the PBR is the most polymorphic region of the groove, could the "gathering" of different amino acids in this region affect its structure when different alleles are compared? Here, we use protein modeling to investigate whether the polymorphism of the PBR among the previously identified alleles leads to structural differences between DRB1 molecules that, in turn, translate into different behavior toward WNV infection and disease progression.

Modeling
The following protocol was used to construct models of DRA/DRB complexes and to evaluate the structural consequences of allele sequence differences. The crystal structure with PDB entry code 2WBJ 15 was used as the starting conformation for molecular dynamics (MD) simulations in explicit water. All amino acid alterations made in the 2WBJ structure to create the amino acid sequences of the proteins under study are described in detail in Supporting Information: Supplementary 2. Two missing residues (109 L, 110 Q), located in a loop of the extracellular domain that is not implicated in the binding groove of the molecule, were added. The protonation state in DR complexes was assigned at pH 7.0 by the PROPKA server and then adjusted accordingly. 16,17 The dimensions of the simulation box used for all systems were 80 × 80 × 96 Å³.
Each system was solvated with TIP3P water molecules and neutralized with sodium and chloride ions. The solvated systems were first energy minimized and then subjected to 15,000 steps of NPT MD simulation at 310 K and 1 atm, with the Particle Mesh Ewald algorithm for handling electrostatics, to adjust the simulation cell dimensions. The resulting systems were then simulated in the NVT ensemble for 3 ns at 310 K. The final systems (the last frame of the MD trajectories) were subjected to energy minimization. The resulting minimized protein conformations were used for further structural analysis. The MD program NAMD (Theoretical and Computational Biophysics Group) 18 and the CHARMM force field for proteins and nucleic acids 19 were used for the simulations. All calculations were performed in the absence of antigenic peptides to enable direct comparison of the structural characteristics of the different molecules.

Root-mean-square deviation/root-mean-square fluctuation calculations
The average coordinate root-mean-square deviation (RMSD) was used to evaluate the degree of structural difference between DRB1 alleles. The backbone atoms of the DRB1 molecules (alpha carbon, carbonyl carbon, and nitrogen atoms) were selected to calculate RMSD values. Root-mean-square fluctuation (RMSF) values were calculated for the backbone atoms of the alleles with the highest and lowest RMSD values. Additional information about the steps followed and the programs and scripts used is included in Supporting Information: Supplementary 1.
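As an illustration of the backbone RMSD described above, the following is a minimal sketch and not the scripts used in this study (those are described in Supporting Information: Supplementary 1). It assumes that the two structures have already been superposed and that each is given as a mapping from a hypothetical (residue number, atom name) key to Cartesian coordinates in Å; the function name and data layout are assumptions made for the example.

```python
import math

# Backbone atoms used for the calculation: nitrogen, alpha carbon, carbonyl carbon.
BACKBONE_ATOMS = {"N", "CA", "C"}


def backbone_rmsd(reference, model):
    """Return the RMSD (in Å) over backbone atoms shared by two structures.

    Both arguments map (residue_number, atom_name) -> (x, y, z).
    The structures are assumed to be superposed already; no fitting is done here.
    """
    keys = [k for k in reference if k[1] in BACKBONE_ATOMS and k in model]
    if not keys:
        raise ValueError("the structures share no backbone atoms")
    squared_sum = sum(
        (a - b) ** 2
        for k in keys
        for a, b in zip(reference[k], model[k])
    )
    return math.sqrt(squared_sum / len(keys))


if __name__ == "__main__":
    # Toy coordinates (hypothetical, not taken from 2WBJ).
    crystal = {
        (1, "N"): (0.0, 0.0, 0.0),
        (1, "CA"): (1.5, 0.0, 0.0),
        (1, "C"): (2.0, 1.4, 0.0),
        (1, "CB"): (2.1, -1.2, 0.0),  # side-chain atom, ignored
    }
    shifted = {k: (x + 1.0, y, z) for k, (x, y, z) in crystal.items()}
    print(backbone_rmsd(crystal, shifted))
```

Side-chain atoms are skipped on purpose, so the value reflects only backbone displacement, in line with the atom selection stated above.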

Sequence differentiation
The amino acid sequences of 23 DRB1 alleles were aligned, and polymorphic sites were found throughout the sequence. As expected, the vast majority of polymorphisms were found in the PBR (exon 2 residues). Variation in other regions was also observed between alleles (Figure 1). However, because our interest was focused on the PBR, only the variability found in exon 2 was included in the protein models produced (Supporting Information: Supplementary 2).
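The identification of polymorphic sites in an alignment can be sketched as follows. This is an illustrative example only: the sequences are short invented strings (not real DRB1 sequences), the allele labels and function names are hypothetical, and a period stands in for the bullet used in Figure 1 to mark identity with the first sequence.

```python
def polymorphic_positions(aligned):
    """Return 1-based alignment positions where more than one residue occurs.

    `aligned` maps an allele label to its aligned sequence; all sequences
    must have equal length. Dashes mark missing regions and are ignored.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must have equal length")
    length = lengths.pop()
    sites = []
    for pos in range(length):
        residues = {seq[pos] for seq in aligned.values() if seq[pos] != "-"}
        if len(residues) > 1:
            sites.append(pos + 1)
    return sites


def difference_view(reference, other, same="."):
    """Show `other` relative to `reference`: identical residues become `same`,
    differing residues and dashes (missing regions) are kept as they are."""
    return "".join(same if b == a else b for a, b in zip(reference, other))


if __name__ == "__main__":
    # Invented toy alignment, for illustration only.
    toy = {
        "allele_A": "GDTRPR",
        "allele_B": "GDTQPR",
        "allele_C": "GD-RPR",
    }
    print(polymorphic_positions(toy))
    for label, seq in toy.items():
        print(label, difference_view(toy["allele_A"], seq))
```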

Root-mean-square deviation analysis
RMSD analysis is used to describe changes in the tertiary structure of a protein during simulation. 26 In Figure 2, a tendency toward structural stabilization can be observed for all molecular complexes. The average RMSD values of the molecules studied (chain B) are presented in detail in Supporting Information: Supplementary 3. Based on the low RMSD values found, no significant overall difference in backbone structure was observed between the susceptible and protective alleles.

RMSF analysis
To estimate how allelic polymorphism affects the dynamics of the backbone atoms of DRB1 molecules, RMSF values were calculated for the structures with the highest and lowest RMSD values. Higher RMSF values indicate greater flexibility of the residues during the MD simulation. Although no structural differences >2 Å between the DRB1 molecules and the crystal structure were identified, we selected the molecules with the highest RMSDs (DRB1*11:04, 03:01, 13:02, 11:02, and 14:04, with values between 2.196 and 2.305 Å) and the lowest RMSDs (DRB1*14:01, 16:01, 08:04, and 11:03, with values between 1.699 and 1.852 Å) to perform RMSF analysis. The deviations from the crystal structure were mainly located in flexible parts between secondary structure elements. No significant fluctuations were identified (Figure 3).
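For reference, a per-atom RMSF can be computed as the square root of the time-averaged squared displacement of each atom from its mean position. The sketch below is a minimal illustration under stated assumptions, not the analysis pipeline of this study: the trajectory is assumed to be a list of frames already aligned to a common reference, each frame a list of (x, y, z) tuples in Å, and the function name is hypothetical.

```python
import math


def rmsf_per_atom(frames):
    """Return the RMSF (in Å) of every atom over a list of aligned frames.

    `frames` is a list of frames; each frame is a list of (x, y, z) tuples,
    with atoms in the same order in every frame.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    n_frames = len(frames)
    n_atoms = len(frames[0])
    values = []
    for i in range(n_atoms):
        # Mean position of atom i over the trajectory.
        mean = [sum(frame[i][d] for frame in frames) / n_frames for d in range(3)]
        # Mean squared displacement from that mean position.
        msd = sum(
            sum((frame[i][d] - mean[d]) ** 2 for d in range(3))
            for frame in frames
        ) / n_frames
        values.append(math.sqrt(msd))
    return values


if __name__ == "__main__":
    # Toy two-frame trajectory (hypothetical): atom 0 moves, atom 1 is fixed.
    trajectory = [
        [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
        [(2.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    ]
    print(rmsf_per_atom(trajectory))
```

Larger values mark more mobile atoms, which is why loops between secondary structure elements typically stand out in such profiles.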

DISCUSSION
In the last decade, the characterization of disease-associated polymorphisms in HLA alleles has come to include data from three-dimensional studies. Although the vast majority of polymorphism-disease associations were found in the one-dimensional context of linear sequences, it became clear that focusing on the protein level, at which the polymorphisms function, could uncover the relative contribution of individual polymorphisms and their potential interactions. Understanding polymorphism at the structural level provides information on the intermediating pathogenic mechanisms while prompting the development of new therapies. 27 To date, 45 HLA-DRB1 structures are available in the RCSB Protein Data Bank. 28-30 However, only DRB1 alleles 04:01, 01:01, 15:01, 11:01, and DR3 are represented in the database. Here, we studied the conformational differences between 22 DRB1 structures that were produced using protein modeling. All alleles studied were identified in our previous work, in which we examined the possible association between HLA-DRB1 alleles and WNV disease in the Greek population. Several DRB1 alleles have been associated with WNV infection and disease progression.
FIGURE 1 The amino acid sequence of the HLA-DRB1 alleles that were identified in our previous study and were used for protein modeling. The allele nomenclature of each sequence is shown. The sequence of the first allele (DRB1*11:04) is fully shown, while for the other alleles only amino acids that differ from the first sequence are shown. Bullets are used when the amino acids are the same. Dashes indicate missing regions.
In our study, RMSD results showed a tendency toward structural stabilization for all allelic complexes. RMSD values indicate the overall structural differences of the HLA-DRB1 alleles compared with the crystal structure. No significant differentiation was observed either between the alleles studied and the initial structure or among the alleles themselves (low RMSD values). Similarly, when the structural comparison was focused on the WNV disease association of the alleles (susceptible/protective), high structural similarity was observed. These findings are in accordance with previous observations about pulmonary tuberculosis and leishmaniasis (alleles 07:01, 11:01, 09:01) 31 and multiple sclerosis (alleles 15:01 and 16:01). 32 RMSD values are determined over the entire structure, so any variation between the alleles could affect them. Here, the amino acid changes to the reference structure were made only for the residues that could influence binding pockets P1, P4, P6, P7, and P9. Therefore, the RMSD values found were more representative of the stability of the binding groove than of the entire protein.
As the structures studied differed in one or more positions, all of them located in the peptide-binding groove of the molecules, we compared the RMSF of each position in the alleles with the highest and lowest RMSD values to find the "greatest" differences between them. Again, no significant fluctuation was observed between the two groups. Based on our previous results, 12 the frequencies of alleles DRB1*08:01, DRB1*16:02, DRB1*11:02, DRB1*14:04, DRB1*11:192, DRB1*08, DRB1*11:15, DRB1*08:04, DRB1*11:18, DRB1*11:03, and DRB1*08:30 were found to be associated with either the control or the case group and with susceptibility/resistance to West Nile neuroinvasive disease. Interestingly, when comparing the alleles of each group (those associated with disease and those included in the RMSF analysis), only a limited number of common alleles was observed. Even among alleles 08:04, 11:02, and 14:04, which were found in both allele groups, only DRB1*08:04 showed slight fluctuation in some residues that are adjacent to residues belonging to peptide-binding pockets.
In conclusion, the structures of several HLA-DRB1 alleles were predicted using computational methods. To our knowledge, this is the first study to explore the 3D structures of HLA-DRB1 alleles, focusing on binding pockets, in relation to WNV disease. High structural similarity and low residue fluctuations were found between the alleles studied. In addition, no differentiation between "susceptible" and "protective" alleles was found. These results suggest that the association of HLA-DRB1 alleles with different roles in WNV infection and disease progression is not due to backbone structural differences between the alleles. Several studies have underlined the different interaction patterns between different alleles and a given peptide, suggesting that protein-protein interactions have a pivotal role in DRB1-peptide binding. Although this is the result of different amino acids in key positions, it seems that amino acid properties such as charge, hydrophobicity/hydrophilicity, and volume are "responsible," rather than large structural differences in the molecule's cleft. Accordingly, in a recent study, the HLA-DRB1 cleft in complex with antigenic peptides showed great rigidity and a lack of substantial conformational flexibility. 33 This structural stringency could be due to the interactions of HLA class II molecules with other immune molecules that bind a wide range of different proteins and alleles, such as HLA-DM and the T-cell receptor, which would dictate the conservation of a certain structural conformation.

DISCLOSURE
The authors declare that no competing interests exist.