Mechanism and evolution of human ACE2 binding by SARS-CoV-2 spike

Spike glycoprotein of SARS-CoV-2 mediates viral entry into host cells by facilitating virus attachment and membrane fusion. ACE2 is the main receptor of SARS-CoV-2 and its interaction with spike has shaped the virus’ emergence from an animal reservoir and subsequent evolution in the human host. Many structural studies on the spike:ACE2 interaction have provided insights into mechanisms driving viral evolution during the on-going pandemic. This review describes the molecular basis of spike binding to ACE2, outlines mechanisms that have optimised this interaction during viral evolution, and suggests directions for future research.

Introduction
SARS-CoV-2 belongs to a subgenus of sarbecoviruses, for which Rhinolophus bats are considered to be the natural host [1,2]. Sarbecoviruses use their spike glycoprotein to attach to the host cell, and SARS-CoV-2 does so in humans by binding the receptor ACE2 (angiotensin-converting enzyme 2) [3,4]. The characteristics of the interaction between SARS-CoV-2 spike and human ACE2 have thus been crucial for SARS-CoV-2 emergence from an animal reservoir and have influenced viral evolution in the human population. Here, I describe the structure of SARS-CoV-2 spike, prerequisites for its binding to ACE2, and structural rearrangements in spike associated with this binding interaction. I then discuss evolutionary changes in SARS-CoV-2 spike that have optimised its binding to human ACE2, providing examples of specific adaptations that occurred between animal CoVs, the prototypic SARS-CoV-2 strain (Wuhan-Hu-1), and the more infectious SARS-CoV-2 "variants of concern" alpha, beta, and omicron, which emerged in the human population during the pandemic [5]. Finally, I frame some important outstanding questions in the field.

General structure of SARS-CoV-2 spike
SARS-CoV-2 spike is a class I fusion glycoprotein made of three protomers, each containing an S1 subunit, responsible for receptor binding, and an S2 subunit, responsible for viral membrane fusion (Figure 1a). The three S1s and the extracellular portions of the S2s adopt an arrangement that defines the pre-fusion conformation of the spike ectodomain [3,6–8]. Upon ACE2 binding, the pre-fusion spike undergoes conformational rearrangements that promote dissociation ('shedding') of the S1:ACE2 complex from S2 to drive membrane fusion, eventually assuming a stable post-fusion conformation [8] (Figure 1a). Cleavage of the ectodomain at several sites by host proteases appears to activate spike for these transitions [7,9,10]. The cleavage sites are located in S1, at the S1/S2 boundary, and in S2, and are acted upon by host proteases such as furin (during spike biogenesis) and TMPRSS2 and/or cathepsins (during viral entry).
Each spike subunit is made of several domains (Figure 1b). S1 contains two large domains: the N-terminal domain (NTD or domain A), a β-sandwich with a galectin-like fold, and the receptor binding domain (RBD or domain B), made of a core formed by five antiparallel β-strands and a large loop known as the receptor binding motif (RBM). While RBD binds ACE2 [3,11–13], both NTD and RBD have been shown to interact with glycans [14,15], possibly to facilitate viral attachment. NTD and RBD are linked by residues 291–330 which, together with the C-terminal part of S1, form two smaller 'subdomains': the NTD-associated subdomain (NTD-s, aka domain D/CTD2/SD2) and the RBD-associated subdomain (RBD-s, aka domain C/CTD1/SD1). These subdomains mediate conformational changes in S1 [11,16]. S2 is made of several long helical regions that undergo large conformational rearrangements to drive membrane fusion [8,17].
In the pre-fusion form of spike, S1 and S2 subunits make an extensive network of interactions (Figure 1a). The three S2s form a long coiled-coil 'fusogenic core' of the protein, while three S1s are located at the membrane-distal 'top' and interact via their RBDs to shield the fusion machinery. Each S1 engages closely with the S2′ of the neighbouring protomer, whilst making lesser interactions with the two other S2s. Such a staggered protomer arrangement leads to concerted conformational changes across the spike trimer, including those associated with ACE2 binding and membrane fusion [11].
Figure 1. General architecture of SARS-CoV-2 spike. (a) Main conformations of SARS-CoV-2 spike trimer: pre-fusion (left, S1 and S2 subunits, PDB ID: 6zge) and post-fusion (just S2 subunits, PDB ID: 6xra), with each protomer shown in a different colour. (b) Domains of S1 subunit with matching residue boundaries on the top: N-terminal domain (NTD, aka domain A) in yellow, receptor-binding domain (RBD, aka domain B) in rosy brown, NTD-associated subdomain (NTD-s, aka SD2, CTD2, domain D) in blue, RBD-associated subdomain (RBD-s, aka SD1, CTD1, domain C) in plum. (c) Conformational changes in S1 (blue) and S2 (red) subunits of one spike protomer associated with a transition from closed (PDB ID: 6zge, left) to open (PDB ID: 6zgg, middle) to ACE2-bound (PDB ID: 7a94, right; ACE2 in green) conformations.

Spike conformations
Spikes of sarbecoviruses are generally able to assume multiple conformations. The pre-fusion conformations can be broadly categorised into closed or locked (characterised by inaccessible RBMs, and thus unable to bind ACE2), open (characterised by erect RBD(s) with accessible RBM(s)), and ACE2-bound (Figures 1c and 2a). Between these archetypal states there exist intermediate conformations, including mixed states within a single trimer and metastable 'transition' conformations.
… destabilise the S1–S2′ interaction, leading to increased mobility of residues 828–855, known as the fusion-peptide-proximal region (FPPR) [8]. Since FPPR is close to both the fusion peptide (recently visualised for the first time with cryoEM and mapped precisely to residues 866–909 [24]) and the R815 protease site (whose cleavage is thought to modulate fusion activity [25]), the subdomains link spike opening and receptor binding to membrane fusion [11]. Conversely, factors that affect the structure and dynamics of the two subdomains likely modulate the ability of spike to open, bind ACE2, and fuse.
Three spike conformations with inaccessible RBMs have been identified (Figure 2a). The first to be described, termed 'closed', buries ~4000 Å² intraprotomer surface area, features partially disordered NTDs and FPPRs, and unstructured 615–640 loops [3] (Figure 2b). We and others subsequently reported the 'locked' conformation with more intimate intraprotomer interaction (>6000 Å²), ordered NTDs and FPPRs, and partially ordered 615–640 loops (Figure 2b), which inhibits these spikes from opening. Qu et al. recently showed that disulphide-stabilised SARS-CoV-2 spike can also assume this second helix-turn-helix-locked conformation and suggested that the two locked conformations are more stable during spike biogenesis at pH < 7 but less stable in neutral pH [23]. The transition from locked to closed conformations is associated with an outward movement of the protomers away and anticlockwise from the central symmetry axis, causing the whole ectodomain to 'dilate' which, together with unfolding of the 615–640 loops, primes spike for a subsequent transition to open conformations.
The open conformations of spike are characterised by at least one of the RBDs being erect with its RBM available for ACE2 binding. To achieve this receptor-binding-enabled conformation, the RBD needs to undergo a ~60° rotation away from the spike central axis, breaking interactions with the other RBDs and the S2 core, and to then be stabilised by its rigidified RBD-s and new interactions with the proximal NTD and RBD [3,7]. Open conformations with two [28,29] (Figure 2a) and even three [30] erect RBDs have also been observed by cryoEM (but less frequently, as erect RBDs cannot stabilise each other).
ACE2 binding induces further conformational changes in the spike trimer [11,21,31,32]. Upon engaging ACE2, the erect RBD shifts further away from the trimer symmetry axis, while the 615–640 loop of its protomer refolds into a stable helical arrangement away from the neighbouring FPPR [11] (Figure 2b). This coincides with FPPR becoming fully unstructured, which may prime S2 for subsequent conformational changes. Moreover, cryoEM [11,32], FRET [33], HDX-MS [21], and MD [31] studies suggested that receptor binding is synergistic. Thus, some of the energy available from formation of the RBD:ACE2 interface facilitates opening of the remaining RBDs to unsheathe the fusogenic core and may also help to prime S2 for subsequent ejection of the fusion apparatus.
Receptor binding culminates in dissociation of S1:ACE2 complexes to fully expose the trimeric S2 core, allowing it to undergo the large conformational changes associated with fusion. This step is accompanied by structural rearrangements within dissociating S1:ACE2 complexes, mainly a ~90° rotation of the NTD and NTD-s with respect to the RBD-s and RBD:ACE2 [11] (Figure 2c). Similar rearrangements have been observed upon S1 binding to the antibody CR3022 [34,35], which causes spike de-trimerisation, suggesting that this conformation with rotated NTD and NTD-s represents a low-energy state of complexed S1 in solution. As this conformation is incompatible with the pre-fusion conformation of the whole trimer (and thus S1 cannot bind back to the fusogenic core), its adoption drives the progression of viral entry from receptor attachment to membrane fusion. MD simulations with membrane-attached ACE2 and spike also suggested that effective uncovering of the S2 core depends on cleavage at the S1–S2 boundary [31], and recent structural studies suggest that S1–S2 cleavage and ACE2 binding are sufficient for effective fusogenic transition [24]. Nevertheless, some spike trimers seem to transition to the post-fusion conformation in the absence of these factors, as such spikes have been observed on purified SARS-CoV-2 virions with cryoET [36,37].
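The residue ranges quoted in this and the preceding section can be collected into a small lookup, which makes it easier to see which named region a given spike position falls in. The sketch below is illustrative only: the names `SPIKE_REGIONS` and `regions_containing` are hypothetical, and the boundaries are simply those stated in the text (treated as inclusive), not an authoritative annotation.

```python
# Inclusive residue ranges exactly as quoted in the text; names are illustrative.
SPIKE_REGIONS = {
    "NTD-RBD linker": (291, 330),
    "615-640 loop": (615, 640),
    "FPPR": (828, 855),
    "fusion peptide": (866, 909),
}


def regions_containing(position: int) -> list[str]:
    """Return the names of all quoted regions that include `position`."""
    return [
        name
        for name, (start, end) in SPIKE_REGIONS.items()
        if start <= position <= end
    ]


if __name__ == "__main__":
    for pos in (300, 614, 630, 854):
        print(pos, regions_containing(pos) or "no quoted region")
```

A position that lies outside every quoted range, such as 614, simply returns an empty list.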

Optimising spike:ACE2 interaction
Three main mechanisms can be posited to have governed the evolution of the spike:ACE2 interaction (Figure 3): (i) increasing RBM exposure (and thus receptor accessibility); (ii) enhancing spike stability upon receptor binding to prevent premature transition to the post-fusion conformation; and (iii) increasing spike affinity for human ACE2.

Mechanisms optimising RBM exposure
The more open spike is, the more available its RBDs are for ACE2 binding. CryoEM studies showed that spikes of early variants of SARS-CoV-2, such as B.1, alpha, and beta, were more open than prototypic spike [18,28–30,38–41], while all spikes of animal sarbecoviruses studied to date were observed in locked conformations [7,20,27]. Interpretation of cryoEM studies on purified spikes requires some caution, as several reports showed, for instance, that spikes appearing fully closed on an EM grid can still engage ACE2 in binding assays [20,22,27], which likely speaks to energetic barriers between these states. Nevertheless, cryoEM studies remain an important tool to identify trends in spike openness, as confirmed with methods such as FRET [33,42] and HDX-MS [21,43].
All other things being equal, amino acid substitutions that destabilise the locked and/or closed conformations will shift the equilibrium towards more open conformations. For instance, D614, located in NTD-s, makes a salt bridge with K854 of the neighbouring FPPR in prototypic spike [8,11,19] (Figures 2b and 3a). The D614G substitution, present in all SARS-CoV-2 variants of concern, removes this interaction, destabilising both locked and closed conformations. Substitutions within the S1:S1 interface can likewise affect spike opening: removal of the negative charge by E484 substitutions in omicron and beta spikes destabilises the RBM:RBM packing at the vertex of the locked and closed conformations [30,40,44], while the K417N substitution in beta and omicron spikes disturbs the side-to-side packing of their RBDs [29,44] (Figure 3b). In fact, K417N seems to be so favourable for spike opening that it is retained in recent omicron variants despite slightly weakening RBD:ACE2 affinity [29,45]. Conversely, R417 of GD-pangolin-CoV spike likely contributes to its full locking by optimising the RBD:RBD packing within the trimer [20] (Figure 3c). Other mechanisms influencing spike opening probably also contributed to SARS-CoV-2 emergence. The S1–S2 furin cleavage site (⁶⁸²RRAR⁶⁸⁵ in SARS-CoV-2) is absent in closely related animal-CoV spikes [46,47] and we have demonstrated that furin cleavage promotes SARS-CoV-2 spike opening [7]. A recent study by Zhao et al. [9] showed the same is true for spikes cleaved with cathepsin (at residue 636). Conversely, Gupta et al. demonstrated that spikes lacking the furin-cleavable loop 679–687 assume only the locked conformation [22]. As both furin and cathepsin have their main cleavage sites in NTD-s, protease cleavage appears to affect the dynamics of this subdomain and its allosteric axis with RBD, which was corroborated by MD [22].
Glycosylation of N370, present in animal-CoVs but not SARS-CoV-2, was also suggested to prevent spike opening [48–50]. However, its importance is yet to be confirmed: while introducing it into prototypic SARS-CoV-2 made its spike more closed [49], its removal did not cause the fully closed spikes of bat and pangolin CoVs to open [27], and it is present in SARS-CoV spike, which opens normally.
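The statement that destabilising the locked or closed states shifts the equilibrium towards open states can be made concrete with a two-state Boltzmann model. This is a deliberately simplified sketch, not a model taken from the cited studies: it assumes only two states (closed and open) at thermal equilibrium, the function name `fraction_open` is hypothetical, and the free-energy values used are arbitrary illustrative numbers rather than measurements.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fraction_open(dg_open_kcal: float, temperature_k: float = 310.15) -> float:
    """Equilibrium fraction of open spike in a two-state model.

    dg_open_kcal = G(open) - G(closed); positive values favour the closed state.
    """
    return 1.0 / (1.0 + math.exp(dg_open_kcal / (R_KCAL * temperature_k)))


if __name__ == "__main__":
    # Hypothetical numbers: a substitution that destabilises the closed state
    # by 1 kcal/mol lowers dg_open from 1.0 to 0.0 kcal/mol.
    before = fraction_open(1.0)
    after = fraction_open(0.0)
    print(f"open fraction before: {before:.2f}, after: {after:.2f}")
```

In this picture a substitution need not stabilise the open state directly; raising the free energy of the closed state has the same effect on the populations.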

Mechanisms stabilising spike for efficient ACE2 interaction
Stabilising spike upon opening and ACE2 binding allows the virus to avoid prematurely releasing its S2 apparatus. Substitutions increasing spike stability have been mapped to S2:S2 and S2:S1 interfaces. For instance, in alpha spike, D1118H forms a histidine triad stabilising the S2 core. We also demonstrated that alpha spike showed further enhanced stability (less de-trimerisation) upon ACE2 binding, compared to prototypic and beta spikes, so long as it was cleaved by furin [29]. We have thus proposed that mutations enhancing furin cleavage, such as P681H (alpha and omicron), P681R (delta), and N679K (omicron), improve spike stability upon ACE2 binding. While a direct molecular explanation for this effect is lacking, changes in NTD-s dynamics upon cleavage seem to enhance not only spike opening but also its stability; in line with this, D614G-only spike also shows higher stability and reduced S1 shedding compared to prototypic spike [38,60] and can accommodate more ACE2 molecules when trypsin-cleaved [61].

Mechanisms increasing spike affinity towards ACE2
The RBM of SARS-CoV-2 spike binds the large membrane-distal face of human ACE2 (represented mainly by its long N-terminal helix: residues 19–54). The interaction was mapped to three hotspots centred around ACE2 residues E35 and K31, D38 and K353, and M82 [12,13,62] (Figure 4a). Overall, twenty ACE2 residues found in proximity to these hotspots are involved in interactions with seventeen residues of the prototypic spike RBM, forming an extensive network of hydrogen bonds, salt bridges, and hydrophobic contacts burying ~1700 Å². Optimisation of this interaction was demonstrated in SARS-CoV-2 variants and was likely also a prerequisite for a host jump of the ancestral virus from an animal reservoir.
A number of substitutions within SARS-CoV-2 RBM can enhance its interaction with human ACE2. Substitutions that directly resulted in new interactions within the RBM:ACE2 interface notably include N501Y (alpha and omicron), which binds ACE2 Y41 and K353 [29,53,63] (Figure 4b), and Q493R and Q498R (omicron), which form salt bridges to E35 and D38 [53,64,65] (Figure 4c). Substitutions that increased spike:ACE2 affinity but had a smaller effect on the interface structure have also been observed: Y453F in mink spike shows enhanced neighbouring interactions, likely because of reduced local solvation [29,66]; T478K and L452R in delta are not directly involved in the interaction, but positively affect the electrostatic complementarity of the interface [67] and the RBM stability [68]; and R346K in omicron BA.1.1 refolds the whole network of intra-RBD interactions, providing a long-distance effect on the RBM [63]. Finally, while most reports agree that D614G alone does not directly affect the affinity of spike towards ACE2 [18,29,38], it appears to be a prerequisite for the affinity enhancement provided by substitutions on the ACE2:RBM interface [29]. This demonstrates the directional nature of viral evolution and the complex interplay of different mechanisms that shape it.
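Substitutions throughout this review are written as wild-type residue, position, and new residue (for example, N501Y replaces asparagine 501 with tyrosine). A minimal sketch of parsing that notation is given below; the names `Substitution` and `parse_substitution` are hypothetical, and the parser handles only single-residue substitutions in one-letter code, not deletions or insertions.

```python
import re
from typing import NamedTuple

# One-letter codes of the 20 standard amino acids.
_AA = "ACDEFGHIKLMNPQRSTVWY"
_PATTERN = re.compile(rf"^([{_AA}])(\d+)([{_AA}])$")


class Substitution(NamedTuple):
    wild_type: str
    position: int
    mutant: str


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'N501Y' into (wild_type, position, mutant)."""
    match = _PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"not a single-residue substitution: {label!r}")
    wild_type, position, mutant = match.groups()
    return Substitution(wild_type, int(position), mutant)


if __name__ == "__main__":
    for label in ("N501Y", "Q493R", "Q498R", "K417N", "D614G"):
        print(parse_substitution(label))
```

Keeping the position as an integer makes it straightforward to group substitutions by site, for instance to see that P681H and P681R affect the same residue.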
Optimising the interaction with human ACE2 was likely key for SARS-CoV-2 zoonosis. The severe SARS-CoV outbreak in 2002–2003 was traced back to a single substitution K479N (corresponding to Q493 in SARS-CoV-2) that enhanced virus binding to human ACE2, allowing an effective species jump from an intermediate host [69,70]. By contrast, we know very little about the origin of SARS-CoV-2. Spike of bat-CoV RaTG13 is the most similar to SARS-CoV-2 but has been repeatedly shown to assume only the closed conformation [7,27] and to bind human ACE2 orders of magnitude more weakly than SARS-CoV-2 [7,27,45,71]. This is likely due to suboptimal contacts made by residues L486 and Y493 (both hydrophobic while charged in recent SARS-CoV-2 variants), and especially D501 (charged, while aromatic in alpha and omicron variants) [12,45,71] (Figure 4d), which recent studies identified as essential for ACE2 specificity [45,72]. However, several recent reports on newly isolated bat-CoVs illuminate SARS-CoV-2 evolution and receptor binding [46,47,73]. While they lack the furin cleavage site, bat-CoVs BANAL-52 and BANAL-103 have S1s very similar to that of SARS-CoV-2 and RBDs almost identical to it, except for a mismatch H498Q, and their RBDs bind human ACE2 with an affinity of only one third of that of SARS-CoV-2 [46]. However, there have been no structural studies on the whole spikes of any of these bat-CoVs.
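To put the affinity comparisons above on an energy scale, a ratio of dissociation constants can be converted into a difference in binding free energy via ΔΔG = RT ln(Kd,weaker/Kd,reference). The sketch below assumes that "one third of the affinity" means a three-fold higher Kd and illustrates "orders of magnitude" with a hypothetical 100-fold ratio; neither number is a measured value from the cited work, and the function name `ddg_from_kd_ratio` is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_kd_ratio(kd_ratio: float, temperature_k: float = 298.15) -> float:
    """Binding free-energy penalty (kcal/mol) for a weaker binder.

    kd_ratio = Kd(weaker binder) / Kd(reference); must be positive.
    """
    if kd_ratio <= 0:
        raise ValueError("kd_ratio must be positive")
    return R_KCAL * temperature_k * math.log(kd_ratio)


if __name__ == "__main__":
    # 3.0: assumed reading of "one third of the affinity"; 100.0: hypothetical.
    for ratio in (3.0, 100.0):
        print(f"{ratio:g}-fold weaker binding: {ddg_from_kd_ratio(ratio):.2f} kcal/mol")
```

Under these assumptions a three-fold ratio corresponds to less than 1 kcal/mol at 298 K, whereas each further order of magnitude adds a fixed increment of RT ln 10.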

Conclusions and perspectives
The recent pandemic has provided an unwanted, yet unique, opportunity to study viral evolution in real time. I have postulated that evolution of SARS-CoV-2 interaction with human ACE2 involved optimisation of spike availability for receptor engagement, its stability, and receptor affinity. However, not every spike emerged under pressure from all these mechanisms combined and there were cases when they had opposite effects. For instance, K417N in beta spike increased its RBD availability (openness) but reduced its affinity towards ACE2, while multiple mutations in omicron spike, which made it more stable upon ACE2 binding, reduced its RBD availability [58,59]. Moreover, these were undoubtedly not the only mechanisms shaping spike evolution, as immune responses and membrane fusion, among others, certainly also had an impact. However, to fully understand the spike:ACE2 interaction, further mechanistic studies are needed in a number of areas, two of which I identify below.
First, it remains unclear what exactly happens to membrane-embedded spike upon ACE2 binding and what the prerequisites for, and steps leading to, membrane fusion are. Indeed, most structural and biophysical studies have used soluble spike constructs, heavily stabilised in pre-fusion conformation and uncleaved by proteases. Given the importance of furin, TMPRSS2, and endosomal proteases for SARS-CoV-2 infection, as well as the recent observation on the importance of acidification for viral entry [74], more studies on fusion-capable spikes are needed to understand how these factors combine with ACE2 binding to prime spike for fusion and what the exact conformational changes associated with it are.
Figure 4. Substitutions affecting ACE2:spike interface. (a) Interactions between human ACE2 (green) and prototypic (wt) SARS-CoV-2 RBM (magenta). All residues mentioned in the text are shown as sticks; the three main hotspots on ACE2 are boxed (PDB ID: 7a91). (b) Interface around residue 501 between ACE2 in green (wt, PDB ID: 7a91)/olive (alpha, PDB ID: 7r0z) and RBM in magenta (wt, PDB ID: 7a91)/brown (alpha, PDB ID: 7r0z). (c) Interface around two hotspots between ACE2 in green (wt, PDB ID: 6m0j)/olive (omicron, PDB ID: 7zf7) and RBM in magenta (wt, PDB ID: 6m0j)/brown (omicron, PDB ID: 7zf7). (d) Comparison of ACE2 binding by omicron spike RBM residues (PDB ID: 7zf7, brown) and RaTG13 (PDB ID: 7drv, lilac).
Second, it is unclear how the ability of SARS-CoV-2 spike to recognise human ACE2 emerged. Recent studies on bat-CoVs offer new insights into the emergence of the RBM. However, without structures of whole bat-CoV spikes and spike:receptor complexes, we do not know how these spikes may differ, whether they are capable of opening, how stable they are, what proteases cleave them, and what they require for fusion. Moreover, our understanding of how these spikes facilitate CoV infection in bats is also limited. Rhinolophus sp. show remarkable diversity of the whole N-terminal helix of ACE2 within the genus, which could suggest that this region is under constant evolutionary pressure [75]. Moreover, while recent reports have suggested that ACE2 binding is an ancestral trait of sarbecoviruses [72], many studies showed that ACE2s derived from various Rhinolophus species do not allow entry of many sarbecoviral species [75–77]. This, combined with no apparent evolutionary pressure on bat-CoV spikes to increase their opening, especially when compared with SARS-CoV-2, suggests that ACE2 may not be the receptor for at least some sarbecoviruses, thus highlighting the necessity for future mechanistic studies on bat-CoV spikes.
Finally, as spikes of many coronaviruses from different families independently evolved the ability to open, often while binding receptors other than ACE2 [1,2], there appear to be some universal molecular determinants defining conformational flexibility of CoV spikes. Revealing those, and further understanding the evolutionary trajectories driving opening and receptor selection at the molecular level, may help us better predict future zoonotic events and design universal therapeutic interventions against coronaviruses.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this article.

Data availability
No data were used for the research described in the article.