Insights from the interfaces of HIV-1 envelope (ENV) trimer viral protein GP160 (GP120-GP41)

Human immunodeficiency virus type 1 (HIV-1) is a life-threatening virus that causes HIV/AIDS in infected humans. The HIV-1 envelope (ENV) trimer glycoprotein GP160 (GP120-GP41) has gained attention in recent years as a potential vaccine candidate for HIV-1/AIDS. However, sequence variation and charge polarity at the interacting sites across clades are shortcomings in the development of an effective HIV-1 vaccine. We analyzed the interfaces in terms of interface area, interface size, and interface energies (van der Waals, hydrogen bonds, and electrostatics). The interfaces were divided into dominant (≥60%) and subdominant (<60%) based on the van der Waals contribution to total energy. 88% of GP120 and 74% of GP41 interfaces are dominated by van der Waals energy and are large, with interface size (98 ± 65 (GP120) and 73 ± 65 (GP41)) and interface area (882 ±


INTRODUCTION
Despite remarkable efforts, developing a vaccine for HIV-1/AIDS has remained a great challenge over the last decade, with disappointing results in clinical trials (Shin, 2016). The unsatisfactory clinical trial results from VaxGen's AIDSVAX gp120 vaccine and MRKAd5 HIV-1 Gag/Pol/Nef have been discussed elsewhere (Adis Editorial, 2003; Überla, 2008). These results could be due to viral-human molecular mimicry, protein structural architecture, viral protein mutation and glycosylation. Despite the serious biotechnological challenges, there is an amplified effort to synthesize the ENV trimer spike protein. The reasonable efficacy shown in the Thai trial vaccine (RV144: ENV-GP120, Gag and Pro) is promising (Rerks-Ngarm et al., 2009, 2013). Since the Thai trial (RV144), the focus has been on the envelope (ENV) as a vaccine candidate. In addition, ENV GP160 was selected as the protein with the least homology in a sequence comparison between HIV and the human proteome (Kangueane et al., 2008).
GP160 ENV trimer spike glycoprotein has gained attention as a potential vaccine candidate in recent years. Producing native-like HIV-1 GP160 envelope trimer glycoprotein is a challenge in designing, developing and validating an effective vaccine from a biochemical, structural and immunological viewpoint (Sanders and Moore, 2017; Doores, 2015). Ringe et al. investigated a number of factors that strongly influence the design, stability and purification of native-like HIV-1 envelope trimer glycoprotein. Alsalmi et al. used a strep-tag method to purify GP160 trimer protein, which yielded cleaved, uncleaved, and fully or partially glycosylated trimers. In addition, they found that cleaved gp140 was not required for trimerization; however, it played a significant role in triggering conformational changes that channel the trimers into compact, three-blade propeller-shaped trimers. Verkerke et al. used lectin affinity chromatography to purify native-like trimers from diverse HIV-1 isolates. The challenges faced in the production, analysis and synthesis of GP160 ENV trimer glycoprotein have been reported (Grimm et al., 2015; Guenaga, 2015). Surface mutation, charge polarity, glycosylation and sequence variation between known variants in different clades are the significant barriers that make it difficult to imitate a native-like conformation of the glycoprotein. It is evident that assembling individual GP160 into a trimer spike complex is a challenge from a protein-protein interaction viewpoint. A large number of GP120 and GP41 structures, determined using different biophysical techniques, are available in the PDB for understanding the underlying molecular mechanism of the interacting proteins. Sowmya et al. demonstrated the correlation between sequence polarity and mean Shannon entropy by calculating sequence polarity for surface residues in GP120 and GP41, and concluded that protein modification could be used to enhance HIV-1 vaccines across different clades, blood, and brain. Nilofer et al. characterized the interfaces of GP120-GP120, GP120-GP41 and GP41-GP41 and reported that the interfaces of GP120-GP120 are largely polar, whereas the interfaces of GP120-GP41 and GP41-GP41 are characterized by both polar and nonpolar residues. We characterize a manually curated dataset of 121 GP120 and 85 GP41 (Figure 1) protein interfaces reported by Nilofer et al. using interface features including interface area, interface size (number of residues at the interface), van der Waals, hydrogen bonds and electrostatics, to verify our previous finding that small protein interfaces rich in electrostatics are often linked to regulatory proteins (Nilofer et al., 2020). The residues at the interface are displayed using CPK depiction (Discovery Studio®; Systèmes, 2020).

Dataset
We used a manually curated dataset of 206 interfaces reported by Nilofer et al. It consists of 121 GP120 (Table 1) and 85 GP41 (Table 2) interfaces. It should be noted that GP120 structures in the PDB are available in the ligand-bound state.

Interface area
Interface area was estimated for each of the 121 interfaces of GP120 and 85 interfaces of GP41 using Naccess (Hubbard and Thornton, 1993). Naccess uses the Lee and Richards method (Lee and Richards, 1971), in which a probe of radius 1.4 Å (Jones and Thornton, 1996) rolls over the protein in the monomer state and in the dimer state to find the accessible surface area (ASA); the interface area is then obtained as delta ASA. Delta ASA (change in accessible surface area) is calculated using the formula: [ASA (monomer subunit 1) + ASA (monomer subunit 2) - ASA (dimer complex)]/2.
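The delta ASA formula above can be illustrated with a short Python sketch. The function name and the numerical values are hypothetical and serve only to show the arithmetic; in practice the three ASA values would come from Naccess output.

```python
def interface_area(asa_a: float, asa_b: float, asa_ab: float) -> float:
    """Return delta ASA per subunit in Å^2: [ASA(A) + ASA(B) - ASA(AB)] / 2."""
    buried = asa_a + asa_b - asa_ab
    if buried < 0:
        # The complex cannot expose more surface than its isolated subunits.
        raise ValueError("complex ASA exceeds the sum of the monomer ASAs")
    return buried / 2.0


# Hypothetical ASA values in Å^2, for illustration only (not from the dataset).
area = interface_area(asa_a=12000.0, asa_b=9000.0, asa_ab=19200.0)
print(area)  # 900.0
```

Dividing by two reports the buried area per subunit instead of the total surface buried by both partners.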

Interface size & interface energies
Interface size and interface energies were estimated for each of the 121 interfaces of GP120 and 85 interfaces of GP41 using PPCheck (Sukhwal and Sowdhamini, 2015). PPCheck uses distance criteria to identify the noncovalent interactions between atoms of the two interacting proteins. It should be noted that the role of water is ignored in this analysis.
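As a generic illustration of a distance criterion (not PPCheck's actual implementation), interface residues can be collected as those having at least one atom within a cutoff distance of an atom in the partner chain. The `Atom` record, the function name and the 4.5 Å cutoff below are assumptions made for this sketch.

```python
import math
from typing import Iterable, NamedTuple, Set, Tuple


class Atom(NamedTuple):
    chain: str
    residue: int
    x: float
    y: float
    z: float


def interface_residues(atoms: Iterable[Atom], chain_a: str, chain_b: str,
                       cutoff: float = 4.5) -> Set[Tuple[str, int]]:
    """Residues with any atom within `cutoff` Å of the partner chain.

    The 4.5 Å default is an illustrative assumption, not a PPCheck parameter.
    """
    atoms = list(atoms)
    side_a = [a for a in atoms if a.chain == chain_a]
    side_b = [a for a in atoms if a.chain == chain_b]
    contacts: Set[Tuple[str, int]] = set()
    for a in side_a:
        for b in side_b:
            if math.dist((a.x, a.y, a.z), (b.x, b.y, b.z)) <= cutoff:
                contacts.add((a.chain, a.residue))
                contacts.add((b.chain, b.residue))
    return contacts


# Toy coordinates, for illustration only.
toy = [
    Atom("A", 1, 0.0, 0.0, 0.0),
    Atom("A", 2, 10.0, 0.0, 0.0),
    Atom("B", 1, 3.0, 0.0, 0.0),
    Atom("B", 2, 20.0, 0.0, 0.0),
]
residues = interface_residues(toy, "A", "B")
print(len(residues))  # interface size = number of interface residues
```

Interface size, as used in this study, then corresponds to the number of residues in the returned set.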

Dominant and subdominant van der Waals interface
Interfaces with a van der Waals contribution ≥60% of total energy (the sum of van der Waals, hydrogen bond and electrostatic energies) are defined as dominant interfaces, while interfaces with a van der Waals contribution <60% of total energy are defined as subdominant interfaces. A cutoff of 60% was chosen because the larger part of the van der Waals contribution lay at this cutoff on a scale of 0-100%.
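The grouping rule can be sketched in Python as follows. The function names are hypothetical, and the percentage is computed here from energy magnitudes, which is an assumption; it matches the ratio of signed energies whenever all three terms have the same sign.

```python
def vdw_percent(vdw: float, hbond: float, elec: float) -> float:
    """Percentage contribution of van der Waals energy to total energy.

    Assumption: magnitudes are used so the percentage is well defined
    even when the energy terms are reported as negative values.
    """
    total = abs(vdw) + abs(hbond) + abs(elec)
    if total == 0:
        raise ValueError("total energy is zero")
    return 100.0 * abs(vdw) / total


def classify_interface(vdw: float, hbond: float, elec: float,
                       cutoff: float = 60.0) -> str:
    """Label an interface 'dominant' (>= cutoff) or 'subdominant' (< cutoff)."""
    return "dominant" if vdw_percent(vdw, hbond, elec) >= cutoff else "subdominant"


# Hypothetical energies (arbitrary units), for illustration only.
print(classify_interface(-80.0, -12.0, -8.0))   # dominant (80% van der Waals)
print(classify_interface(-30.0, -15.0, -15.0))  # subdominant (50% van der Waals)
```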

Statistical analysis
We summarized the interface energies of GP120 and GP41 using statistical variables (Microsoft® Office Excel, version 2003) including mean, mode, distribution, standard deviation and frequency at definite bins and ranges. We also carried out multiple linear regression analysis for each interface, with interface size against van der Waals, hydrogen bond, electrostatic and total energies and interface area, using the regression tool. The coefficient of determination (r²) was calculated, and the p-value was evaluated using ANOVA at the 95% confidence limit.
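For a single predictor, the coefficient of determination equals the squared Pearson correlation. The following Python sketch computes r² that way; it assumes one-predictor fits of an energy term against interface size and does not reproduce the Excel regression tool or the ANOVA p-value. The function name and the data are hypothetical.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Coefficient of determination for a simple linear fit of y on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    # For one predictor, r^2 is the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)


# Hypothetical interface sizes and energy magnitudes, for illustration only.
sizes = [1.0, 2.0, 3.0, 4.0]
energies = [2.0, 4.0, 6.0, 8.0]
print(r_squared(sizes, energies))
```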

RESULTS AND DISCUSSION
The HIV-1 envelope trimer glycoprotein GP160 is a potential vaccine candidate for HIV-1/AIDS (Burton et al., 2004). Structures of GP160 available in the PDB are always found coupled with a ligand, which reflects the limited stability of GP160 without a supporting ligand (Moore et al., 1992). The trimer interfaces are unstable when produced in vitro, and this may be due to their sequence composition and structural conformation (Moore et al., 1990; Chen et al., 2005). Nilofer et al. reported that the interfaces of GP120 are largely polar, whereas the interfaces of GP120-GP41 and GP41 are characterized by an equal contribution of polar and non-polar residues. Interfaces with high polarity have an immense impact on a protein's surface, immunological and stability properties. High polarity at the interface is a bottleneck in the in vitro synthesis and production of GP160 in a stable form. The instability of GP160 could also be due to the difficulty of mimicking the in vivo environment in vitro for protein folding and assembly of the complex (Abagyan and Batalov, 1997; Kinjo et al., 2001).

Figure 2: Mean interface size and interface area of GP120 and GP41 protein interfaces.
Therefore, we characterized the interfaces of GP120 and GP41 using interface area, interface size and interface energies, the latter two estimated with PPCheck (which identifies noncovalent interactions using distance criteria). To verify our previous findings, we used the manually curated dataset of 121 GP120 and 85 GP41 interfaces. The statistical analysis shows that the mean interface size (98±65 (GP120) and 73±65 (GP41)) and interface area (882±1166 Å² (GP120) and 921±1288 Å² (GP41)) are comparable for GP120 and GP41 interfaces (Figure 2). In contrast to our previous study, we observed that most of the interfaces have an interface area <1000 Å² in both GP120 (60%) and GP41 (71%), and about 25% of GP120 and 19% of GP41 interfaces have an interface area between 1500 Å² and 2000 Å² (Figure 3).

Figure 3: Graph showing interfaces with interface area (Å²) for GP120 and GP41.
Subsequently, we described each interface of GP120 and GP41 using van der Waals, hydrogen bond, electrostatic and total energies, along with the varying proportion each contributes at the interface. To do so, we calculated each individual contribution as a percentage of total energy. On average, the interfaces of GP120 and GP41 complexes have a high percentage of van der Waals (77%) and a low percentage of hydrogen bonds (12%) and electrostatics (11%) (Figure 4). In addition, the interfaces of GP120 and GP41 are normally distributed with increasing percentage of van der Waals (Figure 5), whereas the proportion of interfaces decreases with increasing percentage of hydrogen bonds and electrostatics. It should be noted that the interfaces of GP120 and GP41 are similar in the percentage contribution of van der Waals, hydrogen bonds and electrostatics, and in interface area and interface size. We further grouped the interfaces of GP120 and GP41 as dominant (≥60%) and subdominant (<60%) van der Waals based on the van der Waals contribution to total energy. Dominant interfaces therefore have ≥60% van der Waals with a smaller magnitude of hydrogen bonds and electrostatics. We observed that the majority of interfaces of GP120 (88%) and GP41 (74%) are van der Waals dominant, with less than 10% contribution from hydrogen bonds and electrostatics, while the remaining 12% of GP120 and 26% of GP41 interfaces are van der Waals subdominant, with more than 15% hydrogen bonds and electrostatics (Figure 6). We performed statistical analysis on dominant and subdominant van der Waals interfaces to highlight the contribution of hydrogen bonds and electrostatics in the small interfaces. We observed that subdominant van der Waals interfaces of GP120 and GP41 have threefold more of both hydrogen bonds and electrostatics than dominant van der Waals interfaces (Figure 7).
Furthermore, we noticed that subdominant van der Waals interfaces are more pronounced, with more than 20% of hydrogen bonds and electrostatics, and are distinct from dominant van der Waals interfaces (Figure 8). It is evident from Figure 9 that the interface size and interface area of small interfaces are only half those of the large interfaces. Most of the interfaces with subdominant van der Waals have an interface area less than 500 Å² (Figure 10). This is consistent with the earlier statement that small interfaces with subdominant van der Waals energy and small interface area are rich in electrostatics. However, among the small interfaces (subdominant van der Waals) of GP120 and GP41, those of GP120 are rich in hydrogen bonds and those of GP41 are rich in electrostatics. It is known that interface size increases with interface area; hence we correlated interface size with interface energies. It was reported that total energy, van der Waals and hydrogen bonds increase with interface size but electrostatics decrease with increasing interface size (Nilofer et al., 2020). In contrast, the results of our current study show that van der Waals and total energies of GP120 and GP41 interfaces increase with interface size, but hydrogen bonds and electrostatics decrease with increasing interface size (Figure 11). Hence, we divided our interfaces into large (dominant van der Waals) and small (subdominant van der Waals) interfaces based on the percentage contribution of van der Waals to total energy, to check for correlation. It has also been reported that in small interfaces, total energy, van der Waals and hydrogen bonds decrease considerably with increasing interface size, whereas electrostatics moderately increase with interface size (Nilofer et al., 2020). However, this is not the case with the small interfaces of GP120 and GP41. Surprisingly, we found electrostatics (r²=0.63) (Figure 12p) to be highly pronounced in GP41 interfaces with subdominant van der Waals, with a weaker van der Waals contribution (r²=0.23) (Figure 12h) and no hydrogen bond contribution (r²=0) (Figure 12p). In contrast, we observed the small interfaces of GP120 to be highly stabilized by hydrogen bonds (r²=0.59) (Figure 12k), followed by electrostatics (r²=0.20) (Figure 12o). Hence, we report that hydrogen bonds (r²=0.59) (Figure 12k) increase with interface size in the small interfaces of GP120, and electrostatics (r²=0.63) (Figure 12p) increase with interface size in the small interfaces of GP41.

Figure 10: Interface area in large and small interfaces among GP120 and GP41 is shown.
Figure 11: Correlation between interface energy and interface size of GP120 and GP41 interfaces is shown.
Figure 12: Correlation between interface energy and interface size for GP120 and GP41 interfaces in terms of large and small interfaces is shown.

CONCLUSIONS
GP120 viral proteins interact with GP41 to form GP160, the HIV-1 trimer glycoprotein. Statistical analysis of the interfaces of GP120 and GP41 using interface area, interface size and interface energies (van der Waals, hydrogen bonds and electrostatics) demonstrates that they are similar. 88% of GP120 and 74% of GP41 interfaces have large interface area and interface size with dominant van der Waals energy, while 12% of GP120 and 26% of GP41 interfaces have small interface area and interface size with subdominant van der Waals energy.
In addition, small interfaces were observed to have threefold more hydrogen bonds and electrostatics than large interfaces. Hydrogen bonds increase with interface size in the small interfaces of GP120, while electrostatics increase with interface size in the small interfaces of GP41 in the absence of hydrogen bonds. These insights from the interfaces of GP120 and GP41 show that our previous finding, that small interfaces with small interface area are rich in electrostatics, holds true for GP41 but not for GP120.