Spike-heparan sulfate interactions in SARS-CoV-2 infection

Recent biochemical, biophysical, and genetic studies have shown that heparan sulfate, a major component of the cellular glycocalyx, participates in SARS-CoV-2 infection by facilitating the so-called open conformation of the spike protein, which is required for binding to ACE2. This review highlights the involvement of heparan sulfate in the SARS-CoV-2 infection cycle and argues that there is a high degree of coordination between host cell heparan sulfate and asparagine-linked glycans on the spike in enabling ACE2 binding and subsequent infection. The discovery that spike protein binding and infection depend on both viral and host glycans provides insights into the evolution, spread, and potential therapies for SARS-CoV-2 and its variants.

Introduction
SARS-CoV-2 is an enveloped, positive-sense, single-stranded RNA virus that primarily targets the respiratory epithelial cells lining the upper and lower airways, with evidence of spread to other organs. The spike (S) protein studs the outer membrane of the virus and mediates the initial steps of viral infection and spread. It acts by engaging cellular host receptors, such as angiotensin-converting enzyme 2 (ACE2) and neuropilin-1 (NRP1), and it undergoes processing via the transmembrane protease serine 2 (TMPRSS2) or other proteases [1]. Additionally, infection is facilitated by heparan sulfate (HS), a long, negatively charged polysaccharide expressed by all animal cells [2–5]. Here we review current literature demonstrating the involvement of HS in SARS-CoV-2 infection. We discuss the mechanism of action from a structural perspective and explore the coordination between HS binding and asparagine-linked glycans.

The glycocalyx and viral infection
All cells are surrounded by a glycocalyx composed of various types of glycoconjugates, including asparagine-linked (N-linked) and serine/threonine-linked (O-linked) glycoproteins, glycosphingolipids, and proteoglycans. Given their location at the interface between cells and the extracellular milieu, it is not surprising that many viruses exploit glycoconjugates as attachment factors and receptors for infection, including influenza virus, herpes simplex virus (HSV), human immunodeficiency virus (HIV), and coronaviruses (SARS-CoV and MERS) [6–9]. Often the dependence of viral attachment on host glycans is obscured in vitro by the high-affinity interaction with protein receptors, for example ACE2 for HCoV-NL63 and HVEM for HSV, which may be more exposed in cultured cells than in host tissues, such as the lung [10,11]. In fact, MERS binding to sialic acids was only described years after the discovery of its interaction with dipeptidyl peptidase 4 (DPP4) [12]. Glycan binding may act as the initial step for cellular attachment, bringing the virus close to the epithelial cell membrane where it can interact with its protein receptor(s) (Figure 1). This general mechanism appears conserved among coronaviruses (CoVs) such as MERS, SARS-CoV, and HCoV-NL63, which are related to SARS-CoV-2 and bind via the viral spike protein to HS and other glycans [12–16]. The conservation of this mechanism across CoVs suggests that glycan attachment plays a role in CoV infectivity. Unfortunately, the glycocalyx is often ignored in studies of viral infection that focus only on protein receptors. This review focuses on the participation of cell surface HS and spike protein glycans in the interaction of the SARS-CoV-2 spike protein with cells and infection.

Structure of heparan sulfate
HS is a negatively charged, linear polysaccharide that assembles while attached to one or more of the 17 known transmembrane, GPI-anchored, or secreted proteoglycans (Figure 2). These proteoglycans are expressed in a tissue- and cell type-specific manner. HS is assembled by the addition of alternating residues of N-acetyl-D-glucosamine (GlcNAc) and D-glucuronic acid (GlcA) to a primer tetrasaccharide attached to the proteoglycan core protein. During polymerization, the chains undergo processing reactions in which subsets of N-acetyl-D-glucosamine residues undergo N-deacetylation and N-sulfation; adjacent glucuronic acid units undergo epimerization at C5 to alpha-L-iduronic acid (IdoA); and ester-linked sulfate groups are installed at C2 of the uronic acids, at C6 of the glucosamine residues, and occasionally at C3 of the glucosamine residues. These modification reactions do not go to completion, giving rise to domains with variably sulfated sugars and different contents of iduronic acids interspersed by domains lacking all or most of these modifications. The variable length of the sulfated domains, their pattern of sulfation, and their spacing create unique motifs to which a variety of HS-binding proteins bind. These motifs are expressed differentially across cells and tissues, allowing for modulation of specific cellular functions [17]. Protein interactions with HS are driven largely by electrostatic interactions of the sulfate and carboxyl groups in these motifs with positively charged amino acids in the HS-binding domains of HS-binding proteins and by hydrogen bonding. The variation in structure of HS across different tissues and cell types may impact tissue tropism of viruses and other pathogens [18]. Importantly, the structural features of HS are well conserved between species and thus may play a role in zoonotic transmission of viruses.
Heparin, a fractionated form of tissue HS isolated from bovine and porcine entrails, is highly sulfated and iduronic acid-rich, binds to antithrombin, and acts as a potent anticoagulant. Heparin is available commercially and is often used as a surrogate for HS, which has only recently become available through scientific vendors.

Figure 1
Cartoon illustration depicting SARS-CoV-2 viral attachment on a host cell via initial binding to heparan sulfate proteoglycans (HSPGs), followed by ternary complex formation between SARS-CoV-2 spike protein, HSPG and ACE2 (highlighted with a rectangular box).

Figure 2
Representative example of heparan sulfate (HS) structure, sulfation pattern and 3D spatial geometry. Top panel: The SNFG representation of an HS 18-mer connected to the tetrasaccharide (glucuronic acid-galactose-galactose-xylose) linker that is attached to HSPG core protein (HSPG core protein shown in purple) [19]. A potential binding site for a fibroblast growth factor receptor and the high affinity binding site for antithrombin are indicated. Bottom panel: a molecular model, shown in stick form and in a biologically relevant conformation, reflecting the oligosaccharide in the top panel.
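The modification scheme described in this section can be made concrete with a small data model. The following Python sketch is illustrative only: the class name `Disaccharide`, its fields, and the toy chain are assumptions introduced here, not taken from the cited work. It treats each uronic acid-glucosamine repeat as carrying one carboxylate plus one negative charge per sulfate group, and it locates the sulfated domains that are interspersed with unmodified stretches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disaccharide:
    """One uronic acid-glucosamine repeat of an HS chain (hypothetical model)."""
    uronic: str = "GlcA"   # "GlcA" or its C5 epimer "IdoA"
    n_sub: str = "Ac"      # glucosamine N-substituent: "Ac" (GlcNAc) or "S" (N-sulfated)
    s2: bool = False       # 2-O-sulfate on the uronic acid
    s6: bool = False       # 6-O-sulfate on the glucosamine
    s3: bool = False       # 3-O-sulfate on the glucosamine (occasional)

    def sulfates(self) -> int:
        # Count N-sulfate plus the three possible O-sulfates.
        return int(self.n_sub == "S") + int(self.s2) + int(self.s6) + int(self.s3)

    def charge(self) -> int:
        # Assumption: one carboxylate per uronic acid, one charge per sulfate.
        return -(1 + self.sulfates())


def chain_charge(chain):
    """Approximate net charge of a chain of disaccharides."""
    return sum(d.charge() for d in chain)


def sulfated_domains(chain):
    """Return (start, length) for each run of consecutive sulfated disaccharides."""
    domains, start = [], None
    for i, d in enumerate(chain):
        if d.sulfates() > 0:
            if start is None:
                start = i
        elif start is not None:
            domains.append((start, i - start))
            start = None
    if start is not None:
        domains.append((start, len(chain) - start))
    return domains


# Toy 18-mer: an unmodified stretch, a highly sulfated domain, another unmodified stretch.
unmodified = Disaccharide()
trisulfated = Disaccharide(uronic="IdoA", n_sub="S", s2=True, s6=True)
chain = [unmodified] * 3 + [trisulfated] * 4 + [unmodified] * 2

print(chain_charge(chain))      # -21
print(sulfated_domains(chain))  # [(3, 4)]
```

The model ignores chain conformation and counterions; it is meant only to show how incomplete modification yields domains of differing charge density along a single chain.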

HS enhances SARS-CoV-2 infection
The impact of HS on SARS-CoV-2 infection has been demonstrated in vitro by different approaches: (i) genetic and enzymatic manipulation of cell surface HS [3,5]; (ii) molecular and biophysical interrogation of spike protein binding to HS [2,3,20,21]; (iii) structural characterization of putative HS binding sites in the spike protein as well as putative binding modes [3]; (iv) pharmacological intervention by inhibition of HS biosynthesis and competition by heparin and HS mimetics [3,5,21,22]; and (v) non-biased genome-wide CRISPR-Cas9 knockout screens aimed at the identification of critical host factors for infection [23–26]. HS enhances binding of recombinant SARS-CoV-2 receptor binding domain (RBD) and recombinant full-length trimeric spike protein to cells, as well as infection of cells by SARS-CoV-2 spike-pseudotyped viruses, authentic SARS-CoV-2, and seasonal coronaviruses [3,24,25]. Comparable in vivo studies have not yet been performed, but the ubiquitous distribution of HS across tissues suggests that attachment to HS and possibly other glycans in the glycocalyx helps facilitate infection.
Clausen et al. showed that spike protein trimers bind to heparin and to more relevant cell surface HS. Binding to HS and ACE2 occurs in a cooperative manner [3]. Negative-stain transmission electron microscopy and image analysis showed that the binding of recombinant trimeric spike protein to HS enhances conversion to the "up RBD" or "open spike" conformation, which is required for binding to ACE2 [3]. Moreover, binding studies using spike protein mutated to stabilize either the RBD "closed" or the RBD "open" conformation showed that both bound to HS with similar affinities, but only the RBD "open" form bound to ACE2 [3]. These findings suggest a model in which the spike protein initially interacts with cell surface HS, resulting in an increase in the number of RBDs in the "up/open" conformation, which in turn leads to enhanced binding to ACE2 receptors and subsequent stabilization of the interaction of the trimer and the virion with cells [3]. When cells were genetically depleted of HS by CRISPR-Cas9 knockout of the HS biosynthetic machinery, recombinant spike protein was significantly impaired in binding to the cell surface [3]. Similarly, a decrease in SARS-CoV-2 binding to the cell surface occurred when ACE2 was genetically depleted, and the residual ACE2-independent binding was shown to depend on HS. Other studies have confirmed that the spike protein can engage cells independently of ACE2 expression, although the efficiency of engagement was diminished [3,27,28]. Infection may occur independently of ACE2, possibly through micropinocytosis mediated by HSPG internalization [29]. These findings suggest that HS captures SARS-CoV-2 virions, increasing the local viral concentration and presentation of these virions to ACE2 and other proteinaceous receptors [27]. The interaction of virus with HS may facilitate tangential spread of viral progeny across tissues, even to tissues that do not express ACE2.
In addition, HS is expressed across phylogeny, suggesting that the virus could, in a similar way, exploit HS in other susceptible animals, including bats, pangolins, mink, and others. The biochemical and cell culture-based evidence for the involvement of HS in SARS-CoV-2 infection is compelling, but additional experiments are needed to establish the involvement of HS in animal models.

Model of engagement of spike protein with heparan sulfate
Several groups have utilized computational methods to identify sites in the spike protein that engage HS [3,20,21]. These studies each identified different polybasic regions, consistent with the architecture of HS binding sites [17]. Viewed together and in sequence, these proposed sites suggest the presence of a long polyanion binding site starting at the RBD, running down between the RBD and the N-terminal domain (NTD), and continuing to the polybasic furin cleavage site (FCS) (Figure 3a–d) [3,20,21,30]. Electrostatic potential maps of the SARS-CoV-2 spike surface indeed show that this proposed polyanion binding site is a positively charged channel connecting the RBD and FCS (Figure 3c). Furthermore, due to its trimeric nature, the native spike protein exhibits three of these polyanion binding sites, creating the potential for highly flexible and multivalent spike–HS binding modes (Figure 3a–b). Interestingly, the loss of an N-linked glycan at position N370 in SARS-CoV spike leaves a vacant binding site on the RBD primed to receive another oligosaccharide. This idea aligns with data suggesting that the proposed HS binding region in the SARS-CoV RBD carries less positive charge compared to the same site in the SARS-CoV-2 RBD, corresponding to reduced binding affinity to heparin/HS [3].
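To illustrate what locating a polybasic region in a sequence can look like in practice, the following Python sketch flags sequence windows enriched in lysine and arginine. It is a simple density heuristic introduced here for illustration, not the method used in the cited computational studies [3,20,21], which relied on structure-based analyses. The function name `basic_clusters`, the window size, the threshold, and the flanking residues of the toy peptide are all assumptions; only the PRRAR motif of the furin cleavage site reflects the spike sequence itself.

```python
BASIC = set("KR")  # assumption: histidine is not counted as basic


def basic_clusters(seq, window=6, min_basic=3):
    """Find stretches enriched in basic residues.

    Slides a window along `seq`, keeps windows containing at least
    `min_basic` K/R residues, and merges windows that overlap or touch.
    Returns a list of (start, end, n_basic) with 0-based, end-exclusive indices.
    """
    seq = seq.upper()
    spans = []
    for i in range(max(len(seq) - window + 1, 0)):
        n = sum(aa in BASIC for aa in seq[i:i + window])
        if n >= min_basic:
            if spans and i <= spans[-1][1]:
                spans[-1][1] = i + window   # extend the previous span
            else:
                spans.append([i, i + window])
    return [(s, e, sum(aa in BASIC for aa in seq[s:e])) for s, e in spans]


# Toy peptide: hypothetical Gly/Ser flanks around the PRRAR furin-site motif.
toy = "GSGSGSPRRARSVAGSGS"
for start, end, n_basic in basic_clusters(toy):
    print(start, end, toy[start:end], n_basic)   # 5 13 SPRRARSV 3
```

A sequence-only scan of this kind cannot capture residues that are distant in sequence but adjacent in the folded trimer, which is why the proposed binding path was assembled from structural models rather than from primary sequence alone.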
This putative HS-binding site on the spike protein surface is maintained in the rapidly emerging SARS-CoV-2 variants of concern (VOCs), despite the multiple mutations that have occurred relative to the original circulating strain. In fact, mutations in the Delta (B.1.617.2) variant's spike protein sequence have resulted in three additional positively charged amino acids along the putative HS binding groove [31,32]. The most recent variant of concern, Omicron (B.1.1.529), was first identified by S gene dropout in PCR tests due to the more than 30 mutations per protomeric chain [33]. As in the case of the Delta variant spike, many of these mutations result in an overall increase in positive charge in the spike protein, and four such charge-increasing mutations lie along the putative HS-binding groove [34]. These observations suggest that the more infectious variants of SARS-CoV-2 have increased their potential for viral spread, possibly by increasing their affinity for HS. Additional modeling as well as biochemical studies are needed to confirm this hypothesis.
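The charge bookkeeping behind this argument can be sketched in a few lines of Python. The helper names `charge_delta` and `net_charge_change` and the simplified charge table are assumptions made for illustration. The example input uses three reported Delta spike substitutions (L452R, T478K, P681R); whether these are exactly the residues referred to in [31,32] is also an assumption.

```python
import re

# Simplified side-chain charges near neutral pH (assumption: His treated as neutral).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def charge_delta(substitution):
    """Formal charge change for a substitution written like 'L452R'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution)
    if match is None:
        raise ValueError(f"unrecognized substitution: {substitution!r}")
    ref, _position, alt = match.groups()
    return CHARGE.get(alt, 0) - CHARGE.get(ref, 0)


def net_charge_change(substitutions):
    """Sum of formal charge changes over a list of substitutions."""
    return sum(charge_delta(s) for s in substitutions)


delta_examples = ["L452R", "T478K", "P681R"]
for sub in delta_examples:
    print(sub, charge_delta(sub))              # each prints +1 as 1
print(net_charge_change(delta_examples))       # 3
```

Formal charge counts of this kind say nothing about whether a residue is solvent-exposed or positioned along the putative groove, so they can motivate, but not replace, the modeling and biochemical studies called for above.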

Coordination of heparan sulfate binding with spike N-linked glycans
The spike protein contains 22 N-linked glycans per protomer, forming a glycan shield around much of the protein and masking antigenic sites from the immune system. The shield arises from the intrinsic flexibility of N-glycans due to rotations around the glycosidic bonds (phi, psi, and omega angles) which, when modeled over time, gives the spike a characteristic "furry" look

Figure 3
Spike/heparan sulfate binding. ACE2 and the SARS-CoV-2 spike protein are displayed with yellow and cyan surfaces, respectively. The RBD is depicted with a transparent cyan surface. Spike protein and ACE2 are surrounded by a glycan shield generated by N-linked glycans illustrated with blue and dark gold sticks, respectively. HS is depicted with a per-atom-colored, stick representation: oxygen, red; hydrogen, white; nitrogen, blue; sulfur, yellow. Carbon atoms of N-acetyl-D-glucosamine residues are colored in blue, and carbon atoms of L-iduronic acid residues are colored in brown. a-b, Top and side view of the SARS-CoV-2 spike protein bound to HS. c, Electrostatic potential projected onto the RBD HS-binding patch with heparin bound. The surface is colored from red (negative) to blue (positive), representing electrostatic potential values of −4 kBT/e to +4 kBT/e. d, HS contributes to the formation of a ternary complex with the SARS-CoV-2 spike and ACE2.

RBD opening pathway, N165 interacts significantly with the RBD, engaging it as it moves up; in the final steps of this pathway, N165 swings under the RBD, taking its position beside N234 [37]. The HS binding patch on the spike RBD [2,3,20,21,32,34] overlaps significantly with the RBD binding site for N165 [35,36]. Thus, HS has the potential to replace N165 in this role, thereby modulating the well-timed RBD opening at the glycocalyx. Additionally, the handshake between the RBD and ACE2 is stabilized by two N-linked glycans at positions N90 and N322 of ACE2 (Figure 4c) [38,39].
Interestingly, the N-glycan at N546 has also been shown to stabilize spike-ACE2 interactions via glycan–glycan interactions [39]. HS can also participate in this stabilization; as can be seen in Figure 4d, a ternary complex could easily be accommodated between the spike, ACE2 and HS.

Open questions
The discovery of HS as a key factor in SARS-CoV-2 infection raises many interesting questions. What are the molecular details of the SARS-CoV-2 binding and infection mechanism? How is the complex interplay between HS adhesion, spike N-glycans and ACE2 binding coordinated? If the virus binds to the glycocalyx, how does it escape its grip to spread across tissues? Do the additional positively charged amino acid residues present in Delta and Omicron enhance binding to HS and if so, does this enhance transmission? If HS binding increases viral fitness, is it then a driving factor in the emergence of novel variants? Is it also a factor important for zoonotic transmission?

Figure 4
Interplay between N-glycans and heparan sulfate in priming the spike for infection. In all panels, ACE2 and the SARS-CoV-2 spike protein are displayed with yellow and cyan surfaces, respectively. The RBD is depicted with a transparent cyan surface. Glycans contributing to the glycan shield are illustrated with blue sticks. Hydrogen atoms have been hidden for clarity. a, The N-glycan at N343, highlighted in magenta, facilitates RBD opening, acting as a molecular crowbar. b, N-glycans at N165 and N234, depicted with orange and red sticks, respectively, stabilize the RBD in the "up" state, locking-and-loading the RBD for infection. c, N-glycans at N90 and N322 of ACE2, highlighted with yellow sticks, stabilize ACE2-RBD binding. d, HS, represented with per-atom-colored space-filling spheres, modulates RBD opening and stabilization together with N-linked glycans (oxygen, red; hydrogen, white; nitrogen, blue; sulfur, yellow; carbons of N-acetyl-D-glucosamine residues, blue; and carbons of L-iduronic acid residues, brown).

There are conflicting data concerning the accessibility and abundance of HS in the lung. Studies utilizing the anti-HS mAb 10E4 in immunohistochemistry have suggested that lung tissue and monolayers of human airway epithelial cells grown at an air-liquid interface may not express abundant HS on the apical epithelium [40,41]. However, studies utilizing other fixation methods [42] or mAb 3G10 indicate that HS is expressed on lung apical epithelium [43,44]. These articles illustrate an important methodological issue in determining HS accessibility and structure, which could potentially affect tissue tropism.
Each trimeric spike protein can potentially bind to three HS chains and, given the presence of ~24 spike trimers per virion [45], the avidity of the interaction between HS and the virion must be extremely high. Typical HSPGs contain more than one HS chain, but whether a single proteoglycan can act as a scaffold for docking a trimeric spike protein is unclear. Most cells express more than one type of HSPG, raising the question of whether SARS-CoV-2 prefers specific HSPGs. The differential expression of the HSPGs across cell types is relevant because it could explain in part the tissue tropism of SARS-CoV-2 [18]. Finally, HS chains have polarity, with the non-reducing end pointing away from the core protein. Thus, engagement of spike protein with HS on a target cell would occur in a spatially organized manner, with the HS chains extending from the top of the spike towards the virion envelope. The location of the HS-binding site on the side of the RBD opposite to where ACE2 binds would allow each spike protomer to form a ternary complex.
Can insights into the interaction of SARS-CoV-2 spike with HSPGs aid in the development of targeted therapies to control the spread of COVID-19 and future coronavirus outbreaks? Various types of heparin, such as unfractionated heparin, enoxaparin and glycol-split heparin (lacking anticoagulant activity), were shown to block spike or RBD binding or abrogate SARS-CoV-2 infection [2,3,5,21,27,46,47]. Computational studies and binding studies on glycan arrays have defined unusual HS sequences with high affinity for spike protein [48]. These HS variants differ in structure and therefore vary in their capacity to block HS-spike interactions and infection by authentic virus. A polysulfated glycan analog, Pixatimod (PG545), was recently shown to potently block SARS-CoV-2 infection in a K18-hACE2 mouse model. Pixatimod markedly attenuated SARS-CoV-2 viral titer and COVID-19-like symptoms [49]. These findings strongly suggest that heparin or heparin analogs could prove efficacious for blocking infection.
Another approach is to identify small drug-like compounds that block HS formation or binding. A screen of FDA-approved drugs identified several inhibitors of HS-dependent endocytosis [5]. Notably, mitoxantrone was the most potent inhibitor, almost completely blocking HS-dependent endocytosis of virus at 5 μM. Halofuginone, a prolyl tRNA synthetase inhibitor, reduces HS expression and attenuates spike binding and viral infection, in part due to inhibition of HSPGs [22]. Surfen, a small-molecule antagonist of HS-protein interactions, can also block spike protein binding and viral infection in vitro [50]. Clinical studies are now needed to establish the utility of these agents to block viral infection in human patients.

Conclusions
SARS-CoV-2 spike has high affinity for HS. The SARS-CoV-2 spike protein is evolutionarily adapted to bind HS and evolution of the HS binding site may provide further advantage to recent SARS-CoV-2 variants of concern.
Modeling studies indicate that HS can bind an extended path along the spike protein surface, including regions of the receptor binding domain, the N-terminal domain, and the furin cleavage site. HS binding induces activation of the SARS-CoV-2 spike, working in concert with spike N-linked glycans, to induce the "RBD-up" or "open spike" conformation, which is required for ACE2 binding. HS can stabilize the spike-ACE2 interaction by participating in formation of a ternary complex. Additional studies are needed to identify the relevant HSPGs that mediate binding and infection. Anti-virals based on sulfated carbohydrates or heparin analogs have not yet been developed, but clinical trials are underway to explore this approach for treatment of SARS-CoV-2.

Conflict of interest
J.D.E. is a cofounder, and T.M.C. and D.R.S. are consultants, of Covicept Therapeutics, Inc. J.D.E. and the Regents of the University of California have licensed a university invention to and have an equity interest in TEGA Therapeutics, Inc., a vendor for heparan sulfate. The terms of this arrangement have been reviewed and approved by the University of California, San Diego in accordance with its conflict of interest policies.