Predicted pH-dependent stability of SARS-CoV-2 spike protein trimer from interfacial acidic groups

Graphical abstract


Introduction
The spike glycoprotein (S protein) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has received much attention, as the primary point of attachment between virus and ACE2 receptor on a cell surface [1], and as a target in vaccine design [2]. An original focus on the pre-fusion S protein trimer structure in closed and open states, with receptor binding domain (RBD) down and up, respectively, has expanded to include a further RBD down state, termed locked [3], first observed with in a structure with linoleic acid bound [4]. Here, the terms up and down are used to delineate RBD position, so that the open form maps to RBD up, but both closed and locked forms map to RBD down. This 3 state pre-fusion S trimer picture will be expanded with greater characterisation of post-fusion trimers [5], but the transitions between pre-fusion trimers are of key interest for the early steps in cell entry, and also for the export of fusion competent virus. Further, spike protein trimer expressed at the cell surface is able to mediate fusion between cells [6].
The role of pH has been studied for cell entry and exit of coronaviruses. Both plasma membrane fusion and endosomal mediated fusion pathways can be used [7], with fusion primed by S1/S2 spike protein cleavage during the biosynthetic exit pathway [8]. S1/S2 and S2/S2 0 cleavage sites, and the fusion peptide, are critical locations in considering spike trimer conformational transitions [9]. Most cryo-electron microscopy (cryo-EM) structures of spike trimers have been resolved at pH 7.5 -8. Studies at mild acidic pH have found structural transitions [3,10], interpreted as relatively favouring closed over locked forms as pH is increased from mild acidic to neutral [3]. Other cryo-EM analysis shows changes in the fraction of RBD up and down conformations between pH values of 6.5 and 8 [11]. Cell biological investigation of the exit pathway for newly synthesised SARS-CoV-2 virus has revealed a novel mechanism where lysosomes are repurposed for viral transit to the cell surface, with a moderate de-acidification of their mean pH from 4.7 to 5.7 [12]. Other, theoretical, work has looked at the wider effects of virus-induced pH change in SARS-CoV-2 infection [13].
Biophysical and structural analysis of the D614G S protein mutation that emerged in Spring 2020 indicates that it leads to greater availability of the RBD for ACE2 binding, through altered structural transitions [14]. These transitions include packing changes that resemble closed to locked form differences [15]. A model has been proposed in which the locked form is prevalent in the virus export pathway, giving protection of newly synthesised spike proteins at acidic pH, with the D614G mutation coupling to transition between locked and closed conformations [3]. In a complementary model, conformational change in the D614G structure has been proposed to suppress release of S1 from cleaved S1/S2 spike trimers, thus enhancing availability for ACE2 binding [15].
Understanding the factors that stabilise locked, closed, and open S protein trimers is important for vaccine design as well as the pathways of viral infection and biosynthesis. Engineering to stabilise spike trimers [16] is important as knowledge of the most effective conformational targets for vaccine production become apparent [17]. Molecular dynamics simulations are a common tool for structure/function analyses, particular those involving structural transitions. They have been extensively applied to SARS-CoV-2 S protein [18], including studies of receptor binding [19], drug binding to the linoleic acid binding pocket [20], the opening of cryptic pockets as potential drug targets [21], temperature effects [22], and the effect of glycosylation [23].
Specifically for pH-dependent effects, several computational methods are available. Constant pH molecular dynamics has been applied to coronavirus protease active sites, and reported on the protonation states that may be best suited to inhibitor design [24 25]. Atomistic simulations in general are mostly low throughput in terms of the starting coordinate sets that are studied. At the commencement of the current study, about 130 structures were available for SARS-CoV-2 S protein trimers. Since these represent substantial variability, for example locked, closed or open forms, D614 or G614, neutral pH or acidic pH, high throughput computational methods have a role to play, in this case to increase understanding of pH-dependent properties. Based on earlier work [26], a web server for pK a calculations has been benchmarked against known key sites for the acidic pH-dependence of influenza virus hemagglutinin [27]. This work was extended with a dataset of 24 spike trimers, collated in late Summer 2020 and mostly lacking what is now recognised as representatives of the locked form. A sparse set of groups predicted to couple with pH-dependence of closed to open form transition was found, focusing on the environment around a single salt-bridge [21].
The locked form is now more extensively characterised, with about 10 S protein trimers at the end of January 2021, as well as structures at mild acidic pH, and with the D614G mutation. Accordingly, the focus shifts to the transition between closed and locked forms in the current report, revealing a wider set of acidic groups (including D614) that are predicted to modulate stability at mild acidic pH. These groups are at interfaces that tighten in the locked form, relative to the closed form. Indeed, the change in burial of monomers within the S protein trimer is greater for the closed to locked transition, than for the closed to open transition. Predicted destabilisation in the locked form at neutral pH can be offset by either transition to the closed form or reduction to acidic pH. Alternatively, mutation of these acidic groups is predicted to stabilise locked form trimers at neutral pH.

Structural data
A set of 129 SARS-CoV-2 spike trimer structures was retrieved from the RCSB/PDB [28] in January 2021, from a list of list generated by probing the SARS-CoV-2 spike sequence, UniProt [29] id P0DTC2, with the Basic Local Alignment Search Tool (BLAST) [30] into the RCSB/PDB at the National Center for Biotechnology Infor-mation, and subsequent checking for SARS-CoV-2 amongst the coronavirus spike proteins returned. Non-spike binding partners were removed, leaving the 129 trimers and 387 monomers (spike387). The monomers of spike387 are listed in Supplementary  Table 1, along with assignment of RBD up or down according to PDB entry, and confirmed by visual inspection. Subsequent to analysis of spike387, sub-groups of locked, closed, and open S protein trimers were made, each trimer being uniformly RBD down (locked or closed), or RBD up (open). To maintain balance, 8 trimers were used for each sub-group, again with binding partners (ACE2, antibody fragments) removed. Further sub-groups were formed, consisting of: a single trimer with an engineered disulphide link (6zoz [31], termed diSlocked), 3 trimers at acidic pH [10] (pHlocked), and 8 trimers carrying the D614G mutation (D614Gset). Constituent monomers within trimers of the 6 sub-groups, along with RBD up or down designation, are given in Supplementary Table 2. A measure of model fit to cryo-EM map, the segment Manders' overlap coefficient (SMOC) was retrieved for each amino acid in each structure of the sub-groups, from analysis reported by the coronavirus structural task force [32]. An average SMOC value for each structure was calculated.

Structure-based calculations
Solvent accessible surface area (abbreviated to SASA) was calculated for S protein monomer burial within a trimer. For each monomer within a trimer (after non-spike components have been removed), SASA was calculated for the monomer alone and for the monomer within the trimer, the difference (d-SASA) giving the extent of burial. The contribution of polar and non-polar atoms to this burial was recorded, following previous methodology [35]. To facilitate comparison between sub-groups, d-SASA values were averaged over monomers within a sub-group. A key feature is that all calculations were referenced to an alignment of amino acid sequences for all monomers (spike387 and additional monomers added in subsequent formation of sub-groups). This ensures that all comparisons and heat maps are correctly aligned, allowing for engineered monomers and residues present in a sequence but disordered in a structure. Amino acid numbering throughout is given as the equivalent location in UniProt entry P0DTC2. Missing coordinates are not modelled, structure-based calculations use only the ordered residues reported in each structure. From vectors of 1 (ordered) and 0 (disordered) running along the sequence of each monomer, similarity to chain A of the locked conformation 6zb4 was calculated by vector dot product. Monomers were then listed in heat maps (of order/disorder or d-SASA) according to similarity with 6zb4A.
Electrostatics calculations used pK a predictions following the Finite Difference Poisson Boltzmann (FDPB) / Debye-Hückel (DH) hybrid method, termed FD/DH [26]. Ionisable groups that are not buried can sample both FDPB and DH schemes, and these calculations are combined with Boltzmann weighting. Since the latter is a simple water-dominated and highly damped estimate of electrostatic interactions, non-buried groups will not have high pK a devi-ations (4pK a s) from normal values, unless those are stabilising. On the other hand, buried groups, shielded from water-dominated interactions, can have 4pK a values that relate to substantial destabilisation as well as stabilisation. The method has been implemented in a web tool, following testing with sites known to determine the pH-dependence of influenza virus hemagglutinin stability [27]. Here, the focus is on predicted stability change between pH 7.5 (extracellular, cytoplasmic pH) and the lower pH experienced in import and export pathways (pH 5 was used as the lower value). This pH-dependent energy can be derived using the relationship [36 37]: where 4G pH-dep is the pH-dependent contribution to conformational stability (over the pH range of the integration, 7.5 to 5), differenced between two states. The states here are ionisable groups interacting in the FD/DH scheme, and the same set without interactions (null state with normal model compound pK a s). 4Q is the charge difference between those two states, R is the Universal Gas constant, and T is set at 298.15 K. Since the null state is uniform between structures, results for each structure can be compared to assess differential responses to pH change in the mild acidic range. Groups responsible for the predicted effects can be identified since 4Q separates into component ionisations. For 5 of the 6 structural sub-groups, monomers are either all RBD down or all RBD up, and in many cases will have been refined with imposed trimer symmetry. It is still useful to calculate on all monomers to assess the errors associated with numerical parts of the calculation (use of a finite difference grid and Monte Carlo sampling of protonation states). The set of predicted pH-dependent energies for each ionisable group reproduce closely, for example with correlation coefficients of 0.99 between monomers of 6vxx, indicating selfconsistency of the electrostatics calculations.
An additional method was used to calculate pK a s, in order to compare with FD/DH results. The empirical calculation tool PROPKA [38] implemented in PDB2PQR at the APBS server [39], was used (with default parameters) for the 6 sub-groups of spike proteins (36 trimers in total). Predicted 4pK a s from PROPKA were averaged for various single or groups of amino acids, in each of the 6 spike protein sub-groups.
Molecular viewers used to analyse spike proteins were NGL [40], Swiss PDB Viewer [41], and PyMOL. For representative trimer structures in the locked (6zp2), closed (6zp1) and open (7a98) subgroups, d-SASA values were transferred to the B factor column (capped at 99 Å 2 ), facilitating visualisation of monomer burial within a trimer.

Results and discussion
3.1. Differences in monomer burial between S protein trimer forms are more extensive than are order-disorder transitions Monomers within the spike387 set were classed as RBD up or down, noting that down does not distinguish between closed and locked trimers. Clustering of monomers was undertaken according to similarity of vectors representing order/disorder, and referenced to the A chain monomer within the 6zb4 locked trimer [4]. The resulting heat map of order/disorder is shown alongside a record of up or down RBD, and S protein domain structure (Fig. 1). Most clear-cut is ordering around position 840 in a set of structures with down RBDs and known to be in the locked form, at the top of the heat map. Other clusters of monomers, in terms of order/disorder, do not uniformly map to up or down RBD.
A similar procedure was used to map monomer burial within trimers. For each monomer, SASA loss upon integration of that monomer within the trimer was calculated (d-SASA). The order of monomers in the d-SASA heat map of Supplementary Fig. S1 is the same as that in Fig. 1. Summed d-SASA for each monomer is also shown. Known locked conformations mostly exhibit higher d-SASA than other conformations, including closed. The region around 840 in the d-SASA heat map differentiates the locked form again, but there is also a greater richness in differences across the sequence, particular for the RBD, reflecting a tightening of interfaces in the locked form relative to closed form. This analysis gives a view of detailed changes that complement molecular graphics comparison of locked and closed RBD down structures.

Relatively polar interfacial interactions differentiate locked from closed trimers
In order to more easily compare different conformational forms of spike trimer, sub-groups from spike387 were made with either all RBD down or all RBD up monomers. Three sub-groups (locked, closed, open) were matched numerically at 8 trimers (24 monomers) each. Other sub-groups were added: a single locked conformation trimer with engineered cysteines creating a disulphide link [31]  The ratio of non-polar to polar contributions to d-SASA was calculated for the S1 and S2 subunits of each monomer (Fig. 2b). For external protein surfaces not involved in protein-protein interactions the most common value of this ratio, in a distribution over patches, is about 1.2, increasing for more hydrophobic, interacting surfaces [35]. For the open sub-group, with least monomer burial, non-polarity increases for both S1 and S2 contributions. Conversely, monomer interface within a trimer becomes successively more polar for sub-groups that have higher d-SASA, with polarity for S1 d-SASA in the locked form close to that seen for noninterfacial surface (Fig. 2b). It can be concluded that transition from RBD up to RBD down, and further transition to the locked form, involves burial of surface that is not especially hydrophobic.
3.3. Acidic sidechains at tightened interfaces are predicted to have elevated pK a s Comparison of Fig. 3a and Fig. 3b reinforces the observation that d-SASA differences between S trimer sub-groups are more extensively spread over spike monomers than are order/disorder differences. The extent of d-SASA change between locked and closed forms, in comparison with that between closed and open forms, is also emphasised. In light of reported pH-dependent transition between S trimer conformations [3,10], pK a predictions were made for ionisable residues in trimer sub-groups. To directly address changes in the mild acidic pH range, the predicted contribution to change in conformational stability between pH 7.5 (cytoplasm, extracellular) and pH 5 (endosome) was calculated (summed and for each ionisable group). The results are added to Fig. 3 as a heat map of calculated per-residue pH-dependence, averaged over monomers within each sub-group (Fig. 3c). Towards the Cterminal part of S2 are several histidines [27,42], for which burial and predicted resistance to protonation from pH 7.5 to 5 leads to destabilisation in all sub-groups. Of more interest for differences between sub-groups are amino acids with acidic sidechains that have elevated pK a s and are predicted to be stabilising from pH 7.5 to 5, since destabilisation is reduced. Destabilisation derives from a local environment that suppresses ionisation at a pH above the normal pK a . Several Asp and Glu sidechains are of interest, particularly those for which the effect is exhibited in locked forms but not others, as a consequence of the increased burial. Summed values of predicted pH-dependence give a stabilisation of locked form sub-groups as pH falls from 7.5 to 5, but a destabilisation of other sub-groups over the same range (Fig. 4). The consequence of these predictions is a model in which the locked form is preferentially stabilised in low pH secretory pathway vesicles, whereas closed (unlocked) and open forms are stabilised at extracellular, neutral pH, as suggested from structural studies [3].  Looking at individual group contributions, 3 His in S2 (H1048, H1064, H1088) are buried and largely inaccessible to solvent throughout the sub-groups, so that they give a uniform destabilisation over the pH 7.5 to 5 range, as each His transits through the pK a (6.3) that would normally mark ionisation. These groups are not predicted to contribute differentially to pH-dependence of locked, closed, and open forms. Next, D398 was proposed to couple to pHdependence of closed (mostly unlocked) to open forms [42]. This result is replicated in the current larger study (Fig. 4), with a predicted D398 contribution to variation of pH-dependence between locked, closed, and open forms, but in the opposite sense to the overall effect. Two histidines have been suggested as potential mediators of locked trimer stabilisation at acidic pH [3], but no clear differential contributions of H49 and H519 are seen between sub-groups of structures, with the FD/DH calculation of pHdependent stability (Fig. 4).
It is of interest to examine whether structure resolution is a factor in the results shown in Fig. 4, in particular for the key predicted differences in pH-dependence between locked and closed sub-groups. A comparison of reported resolution and segment Manders' overlap coefficients for the 36 sub-group trimers Fig. 3. Acidic groups with increased burial in the locked form are predicted to give a mild acidic pH-dependence. (a) Order/disorder is illustrated with stacked histograms (1/ 0) for the 6 sub-groups of spike trimers. Differences of d-SASA between sub-groups are more extensive than are those of order/disorder, illustrated with cumulative plots of the closed form subtracted from the 5 other sub-groups (b). The magnitude of locked to closed difference is larger than that of open to closed. All sub-groups other than D614Gset consist of either all RBD down or all RBD up S trimers. The small difference between D614Gset and closed sub-groups reflects an average over substantial variation in D614Gset (Fig. 2a). (c) A heat map of predicted conformational stability change from pH 7.5 to 5, for each sub-group (labelled), with stabilising (red) or destabilising (blue) indicated. Amino acids that differ the most between sub-groups are indicated at the bottom of the plot, together with additional grey vertical bars aligned with the heat map. Panels ( Supplementary Fig. S2a) shows the general decline of SMOC values for better resolution structures that has been observed previously [43], and interpreted as the influence of map resolution on the level of detail required for a given fit of model to map. Given this rather complex relationship, a SMOC criterion was excluded from filtering for structure quality, leaving the reported resolution. Filtering at 3.0 Å resolution gives 5 of 8 locked form sub-group trimers and 4 of 8 closed form sub-group trimers. Predicted pHdependence for these resolution-filtered structures (Supplementary Fig. S2b) matches those given for the full locked and closed form sub-groups (Fig. 4), thus giving confidence in those calculations.
In order to provide context for this particular calculation method, the PROPKA empirical algorithm for pK a prediction [38] was employed, as implemented at the APBS/PDB2PQR server [39]. Here the pH-dependent stability contributions of individual groups are not available, so 4pK a values were used, which are positive for aspartic acid and glutamic acid destabilisation and negative for histidine destabilisation (Supplementary Fig. S3). Over different selections of groups, similar predicted effects are seen to those for the FD/DH method (Fig. 4). The 3 S2 subunit His (H1048, H1064, H1088) are predicted to be substantially destabilising in all S protein sub-groups. D398 and D614 are predicted to be destabilising, with similar trends between sub-groups to those in Fig. 4. The subset of groups labelled 14-DE refers to those Asp/ Glu labelled in Fig. 3c, but excluding D398 and D614 since they are shown separately. Locked, pHlocked and diSlocked all show 4pK a s greater than average Asp/Glu values, for 14-DE, whilst 4pK a s for closed and open forms are more comparable to average, with D614Gset intermediate. This is consistent with highlighting these acidic amino acids as being of interest in differentiating the predicted pH-dependent stabilities of locked and other S trimer forms. There is a difference between predictions of FD/DH (Fig. 4) and PROPKA (Supplementary Fig. S3) in the greater contribution of H49 and H519 according to the PROPKA method. Interestingly, H519 is in the proximity of D979, one of the highlighted acidic groups in Fig. 3c, and the mutation H49Y increases cell entry (relative to wild type), in an S-pseudotyped lentiviral system (although much less than D614G) [44]. Given these observations, it would be reasonable to keep H49 and H519 in mind when considering potential sites of pH-sensing in the S protein trimer.

D614G mutation and structural preferences
A feature of predicted pH-dependence for which the picture changes when locked (RBD down) structures are separated from closed (RBD down), is the influence of D614. Contributions of D614 to the calculated pH-dependence are seen for locked and pHlocked sub-groups, in a sense to stabilise those forms at pH 5 relative to pH 7.5 (Fig. 4). It has been reported that the S protein D614G mutation increases virus infectivity [45], and suggested that the mutation leads to a greater population of up relative to down RBDs due to loss of an inter-monomer latch formed by the sidechain of D614 [14]. Recent structural studies with an S protein ectodomain engineered with D614G, report that the increase in infectivity results from stabilisation of the S trimer against dissociation [5]. One of the structures (7krq) from that D614G study has all RBDs down, and d-SASA for monomers in this structure (Supplementary Fig. S4) are similar to those seen for locked forms (Fig. 2a), consistent with trimer stabilisation. Generally, the D614Gset subgroup exhibits a wide distribution of monomer d-SASA values, but these structures mostly have mixed up and down RBD within each trimer (Supplementary Fig. S4).
D614 is predicted to be relatively destabilising at neutral pH in locked and pHlocked forms (Fig. 3c, Fig. 4). Even after incorporating stabilising D614 sidechain interaction with K854, the reduction in solvent exposure due to ordering of the loop around residue 840 leads to a predicted average destabilisation, at pH 7.5 compared to pH 5 and averaged over the locked sub-group, of 7.7 kJ/mol. This implies a mean D614 pK a shift to about 6.0, from the intrinsic value of 4.0, consistent with wide-spread D614 -K854 salt-bridge formation in locked structures at pH 7.5 or 8. There are examples of this salt-bridge in all sub-groups (other than D614Gset), although at lower incidence outside of the locked sub-group. Our interpretation is that D614 -K854 interaction is solvent exposed and relatively weak outside of locked foms. Then, ordering of the loop around residue 840 in the locked form decreases solvent exposure, putting a greater onus on salt-bridge formation to reduce destabilisation of D614. Summed over interactions though, destabilisation of D614 is predicted to result for the locked form, and will be greater at pH 7.5 than at pH 5. This model is consistent with stabilisation of the D614G S trimer against dissociation at neutral pH [46]. Referring to the 3 structures in the pHlocked sub-group, for two (6xlu, pH 4.0 and 7jwy, pH 4.5) the 840 loop is ordered. In both of these, K854 is outside of salt-bridge range, consistent with elevated pK a and protonation of D614 at acidic pH.

Predicted locations of pH-dependence flank regions of known structural sensitivity
In common with D614, other predicted sites of pH-dependence are coupled to a tightening of monomer-monomer interfaces in the locked form S trimer (Fig. 5a), so that solvent exposure is reduced and Asp/Glu sidechain pK a s are elevated, with predicted destabilisation of the locked form as pH increases through the mild acidic pH range. Presumably, pH-independent features, such as linoleic acid binding [4], also couple to the energetic balance between locked and closed forms. For the pH-dependent component, it is notable that a tendency towards a locked form at acidic pH is still evident for the D614G S protein trimer [3], consistent with the existence of other pH-dependent sites (Fig. 3c, Fig. 5a). Whereas the D614 -K854 interaction is between monomers, neighbouring basic residues for other acidic sites of predicted pH-dependence, where they exist, tend to be from the same monomer. Nevertheless, the predicted effect is the same, it is the solvent exposure of the acidic group rather than the details of salt-bridging that is the key determinant of predicted pH-dependence. Several of the implicated acidic residues are present in only SARS-CoV-2 and SARS, of coronaviruses known to infect humans (Fig. 5b).
Display of d-SASA on S protein structure shows the expected difference between open (Fig. 6a) and closed (Fig. 6b) forms, and emphasises the difference (that is not evident by overall RBD location) between closed (Fig. 6b) and locked (Fig. 6c) forms. General location (Fig. 6d) and detail (Fig. 6e) are shown for residues that are present in the more compact interfaces of the locked form trimer, and which are predicted to contribute to pH-dependence over the mild acidic range. Some residues (D586, D614, D830, D843) flank regions of known importance for structural transitions (S1/ S2 and S2/S2 0 cleavage sites, fusion peptide), and other residues (D405, D420, E465, D979, D994) are adjacent to a site commonly engineered to generate more stable trimers [3]. Additionally, sites of predicted pH-dependence in the RBD are proximal to the linoleic acid binding site [4]. It is therefore suggested that members of this set of acidic residues (Fig. 5b, Fig. 6e), beyond D614, couple pHdependence to conformational biases of functional relevance.

pH-dependence generated through acidic groups at interfaces
In this model for pH-dependence of locked to closed form transition, a surface acidic group, whether or not paired with a basic group, loses substantial SASA on formation or tightening of an interface. The acidic pK a increases in the range pH 4 to 7. A bias against interface formation is introduced, which is relieved as pH reduced. Depending on the conformational details and interactions with neighbouring groups, a buried acidic group could be stabilising, but that is not predicted here.
Although functional ionisation of carboxylate sidechain groups at mild acidic pH is most well-known for acid-base catalytic systems, such as the elevated pK a of E35 in hen egg white lysozyme [47], many non-catalytic occurrences have been discussed. In The presence of these residues in a set of coronaviruses that infect humans (other than SARS-CoV-2) is shown with a stacked histogram. For example, E132 is present only SARS-CoV-2, whilst D994 is also present in the other 6 viruses. Fig. 6. Location of sites predicted to differentiate the mild acidic pH-dependence of locked and closed forms. Tube plots of spike protein backbone are shown with colourcoding from d-SASA of 0 Å 2 (yellow) to 80 Å 2 (red) for open/7a98 [9] (a), closed/6zp1 [31] (b), and locked/6zp2 [31] (c) forms. Within the regions of increased d-SASA in locked relative to closed forms are amino acids with acidic sidechains, shown in a coarse view with magenta tube plot (one of 3 symmetry-related representatives) on a background of spike trimer S1 (grey) and S2 (orange) subunits (d). This region is drawn in greater detail (e) with backbone tube and amino acid stick plots, noting that this one of the 3 representatives consists of regions from all 3 monomers (colour-coded tube). Sites of interest with regard to conformational change and stability within the trimer are also indicated. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) viruses, coiled-coil formation in a variant influenza hemagglutinin is pH-dependent between pH 4.5 and 7.1, possibly mediated by E69 and E74 [48]. A model for pH-dependent association of some antigenic peptides and class II proteins of the major histocompatibility complex suggests that buried interfacial Asp or Glu residues mediate this process, with pK a s higher than 7 [49]. Interfacial residues D612 and E613 of PsaB in photosystem I are proposed to be responsible for mild acidic pH-dependence of electron transfer in complexes with plastocyanin or cytochrome c 6 [50]. The periplasmic chaperone HdeA of Escherichia coli acts on a shift from neutral to lower pH via dimer dissociation and partial unfolding, these changes mediated by two aspartic acid residues [51]. In a further example, aspartic acid residues have been proposed as determining neutral to low pH conformational variation in an antigen-binding region of a therapeutic monoclonal antibody [52]. An acidic pHdependent stability has been engineered into staphylococcal nuclease with the mutation V66E, burying a glutamic acid (measured pK a 8.8) in the protein interior [53]. Here, the unfavourable stability consequence of glutamic acid burial is absorbed by the stability of the enzyme fold, with the addition of water molecules that penetrate to form a small channel around V66E [53]. For spike protein, it is hypothesised that unfavourable burial of acidic groups could also be coupled to other conformational factors (that are energetically favourable), including formation of the tighter locked form interfaces and linoleic acid binding. In general terms, there are some parallels with conformational transitions and pocket factor binding in picornaviruses [54]. Given the background of acidic sidechains mediating mild acidic pH-dependent effects, it is reasonable to investigate similar interactions when considering the pH-dependence of spike protein conformation, supplementing the assessment of histidine protonation in regard to pH-induced changes in viruses [55].

Conclusions
Analysis of spike trimer structures shows the expected large SASA change, for monomer burial, between closed and open forms, and an even larger change and interface tightening between closed and locked forms. A subset of acidic groups was identified in the locked form monomer-monomer interfaces that are predicted to mediate pH-dependent stability at mild acidic pH. These are located in regions that flank sites of known importance for stability engineering and/or function, such as cleavage sites and fusion peptide. D614 is amongst the groups identified and, in common with other acidic groups in the subset, is predicted to destabilise the locked form at neutral pH. This result is consistent with a report that the G614 mutation can exist in a form that resembles locked and stabilises the trimer at neutral pH, leading to suppression of S1 subunit loss and increased RBD -ACE2 binding [46].
The current work is also consistent with a model for pHdependent conformational masking of RBDs within the S protein trimer, derived from the 3 structures of the pHlocked set [10]. That study highlighted D614 and D586 (in common with our calculations), and also D574. If the prediction of a wider set of acidic groups relevant for pH-dependence is correct, then it might be asked why, from this set, only mutation at D614 has been seen widely seen in SARS-CoV-2 genomes. Of note though, a study of recurring spike protein mutations highlights several regions that overlap with, or are adjacent to, segments identified in Fig. 6e [56]. These mutations could influence folding and packing around the sites predicted to mediate pH-dependence. Since pH modulation of the D614G spike trimer conformation has been reported [3], there must be groups beyond D614 responsible for some degree of pH-coupling.
The recently discovered lysosomal exit pathway for newly synthesised SARs-CoV-2 virus provides a mild acidic environment, even including the observed moderate de-acidification [12]. Reduced acidification is thought to reduce activity of lysosomal proteases, thereby protecting exposed viral proteins from proteolysis, beyond the S1/S2 priming cleavage of S protein that confers an advantage for SARS-CoV-2 infection [57]. Stabilising the locked form at acidic pH is proposed to contribute to protection of the spike protein along the secretory pathway [3]. The current results are consistent with this model, extending the set of groups that may be responsible for pH-dependent stabilisation (beyond D614). Importantly, the balance between priming cleavage and stabilisation along the biosynthetic pathway is the focus for ongoing viral adaptation, including the observation that P681R mutation in the B.1.617 lineage optimises the furin cleavage site and could enhance transmissibility and pathogenicity [58].
An implication of the current work is that it should be possible to engineer more stable locked forms at neutral pH through mutation of the acidic groups, (e.g. Asp to Asn), at sites indicated in Fig. 6e. In line with the observation of close packing around G614, Asp to Gly changes could be tested, although the D614N mutation also leads to a more compact spike trimer [59]. Such analysis would probe the balance of stabilising and destabilising contributions to spike trimer structure during trafficking, and it could provide further insight into production of specific spike protein conformational forms for vaccines.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.