Uncovering the Role of N-Glycan Occupancy on the Cooperative Assembly of Spike and Angiotensin Converting Enzyme 2 Complexes: Insights from Glycoengineering and Native Mass Spectrometry

Interactions between the SARS-CoV-2 Spike protein and ACE2 are among the most scrutinized reactions of our time. Yet, questions remain as to the impact of glycans on mediating ACE2 dimerization and downstream interactions with Spike. Here, we address these unanswered questions by combining a glycoengineering strategy with high-resolution native mass spectrometry (MS) to investigate the impact of N-glycan occupancy on the assembly of multiple Spike-ACE2 complexes. We confirmed that intact Spike trimers have all 66 N-linked sites occupied. For monomeric ACE2, all seven N-linked glycan sites are occupied to various degrees; six sites have >90% occupancy, while the seventh site (Asn690) is only partially occupied (∼30%). By resolving the glycoforms on ACE2, we deciphered the influence of each N-glycan on ACE2 dimerization. Unexpectedly, we found that Asn432 plays a role in mediating dimerization, a result confirmed by site-directed mutagenesis. We also found that glycosylated dimeric ACE2 and Spike trimers form complexes with multiple stoichiometries (Spike-ACE2 and Spike2-ACE2) with dissociation constants (Kds) of ∼500 and <100 nM, respectively. Comparing these values indicates that positive cooperativity may drive ACE2 dimers to complex with multiple Spike trimers. Overall, our results show that occupancy has a key regulatory role in mediating interactions between ACE2 dimers and Spike trimers. More generally, since soluble ACE2 (sACE2) retains an intact SARS-CoV-2 interaction site, the importance of glycosylation in ACE2 dimerization and the propensity for Spike and ACE2 to assemble into higher oligomers are molecular details important for developing strategies for neutralizing the virus.


Experimental Procedures
Cell Culture, Protein Expression, and Purification
SARS-CoV-2 Hexapro Spike was obtained from Addgene (courtesy of Prof. Jason McLellan, Addgene 154754). The plasmid encoding residues 17-726 of the ACE2 ectodomain with a C-terminal hexahistidine tag was cloned into phLSEC (a kind gift from Prof. Nicole Zitzmann). To knock out N-glycosylation at Asn432, a gene encoding the ACE2 ectodomain with the T434A mutation was obtained from IDT and subcloned into the phLSEC backbone. HEK293T and HEK293S GNTI-/- cell lines were obtained from ATCC (CRL-3216 and CRL-3022, respectively). Cells were maintained in DMEM:F12 supplemented with 1X GlutaMAX, 10% fetal bovine serum, and 1X non-essential amino acids (Invitrogen). Fresh aliquots of cells were obtained for this study and were therefore not tested for the presence of mycoplasma. The evening prior to transfection, 8-12 x 10^6 cells were plated in a T175 flask and grown overnight. The following day, the adherent cells were washed once with ~5 mL of PBS and then transfected with PEImax (Polysciences) using a 1:3 ratio of DNA:PEI. The media volume was brought to ~30 mL and protein expression was allowed to proceed for 48-96 h. Supernatants from 10-30 T175 flasks were pooled, clarified by centrifugation at 12,000 x g (4 °C, 30 min), passed through a 0.22 µm filter, and then either stored at -80 °C or used immediately. The ACE2 Asn432 knockout plasmid was transiently transfected into suspension-adapted HEK293S GNTI-/- cells using 293Fectin (Life Technologies, Thermo Fisher Scientific) following the manufacturer's recommended protocol. On the day of transfection, cells at ~2 x 10^6 cells/mL in 200 mL of FreeStyle media (Life Technologies) were transfected as described, and protein expression was allowed to proceed for 96 h. Supernatants were supplemented with imidazole to 20 mM before isolating the overexpressed proteins by Ni-NTA chromatography.
The eluted proteins were concentrated to ~1 mg/mL using a 100 kDa MWCO (Spike) or 50 kDa MWCO (ACE2) centrifugal filter and further purified by size exclusion chromatography using a Superdex 200 Increase 10/300 column equilibrated with a buffer composed of 20 mM Tris (pH 8.0) and 150 mM NaCl. The concentration of each protein was determined from A280 measurements and extinction coefficients calculated with the Expasy ProtParam suite (https://web.expasy.org/protparam/). Proteins were concentrated to ~10 µM, snap frozen in liquid nitrogen, and stored at -80 °C until use.
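The A280-based concentration estimate follows the Beer-Lambert law. The sketch below is illustrative only: the extinction coefficient, molecular weight, absorbance reading, and 1 cm path length are hypothetical placeholders, not values from this study.

```python
def molar_concentration(a280, ext_coeff, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/L.

    ext_coeff is the molar extinction coefficient in M^-1 cm^-1
    (for example, a value calculated by ProtParam from the sequence).
    """
    if ext_coeff <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a280 / (ext_coeff * path_cm)


def mg_per_ml(molar, mw_da):
    """Convert mol/L to mg/mL: (mol/L) * (g/mol) = g/L = mg/mL."""
    return molar * mw_da


# Hypothetical inputs, for illustration only (not measured values).
EXT_COEFF = 150_000.0  # M^-1 cm^-1, assumed
MW_DA = 85_000.0       # Da, assumed polypeptide mass without glycans
A280 = 0.75            # assumed absorbance reading

conc_molar = molar_concentration(A280, EXT_COEFF)
conc_mg_ml = mg_per_ml(conc_molar, MW_DA)
print(f"{conc_molar * 1e6:.2f} uM, {conc_mg_ml:.3f} mg/mL")
```

Because sequence-based extinction coefficients ignore glycans, the molar concentration is the more robust quantity for heavily glycosylated proteins; the mg/mL value here refers to polypeptide mass only.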

Mass Spectrometry
Native mass spectrometry experiments were carried out using an Orbitrap Q-Exactive UHMR instrument as described [1]. Prior to analysis, proteins were buffer exchanged into 350 mM ammonium acetate (pH 7.4) using Micro Zeba Spin columns with a 40 kDa MWCO (Pierce). For native mass spectrometry analysis, 1-3 µL of the analyte solution was loaded into gold-coated electrospray capillaries prepared in-house. Typical electrospray parameters were: ~1.2 kV ESI voltage, 80-150 °C capillary temperature, and 0.5-1 mbar backing pressure. Low-energy collisions within the in-source trapping region (20-50 V) and the HCD cell (20 V) were used to assist with thermalizing the high molecular weight ions. Data were obtained in the positive ion mode at a resolving power of 25,000 (at m/z 200). For Spike-ACE2 binding experiments, the buffer-exchanged proteins were combined in the desired ratios and incubated for ~10 min at room temperature before analysis. The instrument was operated in "low mass" detection mode in the ACE2 dimerization studies. All other assays were carried out using the "high mass" detection mode. For equilibrium binding assays, the solutions were mixed by gentle pipetting and allowed to sit at room temperature for ~15 min to equilibrate prior to analysis.

Glycoproteomics
To prepare glycopeptides, ~15 μg of protein was treated with 8 M urea for 10 min with periodic vortexing to induce unfolding. Disulfide bonds were reduced with 20 mM TCEP at 56 °C for 45 min. The reduced cysteines were alkylated with 20 mM iodoacetamide in the dark for 1 h. The urea was diluted to 1 M and sequencing grade trypsin (Promega) was added at a 1:20 w/w ratio. Tryptic peptides were generated by incubating the solutions overnight at 37 °C. Tryptic peptides were desalted using C18 stage tips (Pierce) and analyzed using an Orbitrap Eclipse Tribrid platform as reported previously [2]. Glycopeptide data were analyzed with Byonic and Byologic (Protein Metrics) using the search parameters described by Burnap and Struwe [3]. All glycopeptide assignments were manually validated. For quantification, the extracted ion chromatogram intensities for each glycopeptide and unoccupied peptide were summed and plotted relative to the total intensity for each glycosite. For proteins expressed from HEK293S GNTI-/- cells, all glycopeptides were identified as high-mannose, consistent with previous glycomics work which showed that all N-glycans are Man5GlcNAc2 for Spike and ACE2 proteins [3]. However, we opted not to restrict the glycopeptide databases used in searching these datasets; all 37 common N-linked glycans were searched as common modifications.
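The per-site quantification described above (summing extracted ion chromatogram intensities and normalizing to the total for each glycosite) can be sketched as follows. This is a minimal illustration, not the authors' workflow: the record layout, the function names, the site labels, and the intensities are all hypothetical.

```python
from collections import defaultdict


def site_fractions(records):
    """Normalize summed XIC intensities within each glycosite.

    records: iterable of (site, glycan_or_None, intensity); a glycan of
    None marks the unoccupied (non-glycosylated) peptide.
    Returns {site: {label: fraction_of_site_total}}.
    """
    summed = defaultdict(lambda: defaultdict(float))
    for site, glycan, intensity in records:
        label = glycan if glycan is not None else "unoccupied"
        summed[site][label] += intensity

    fractions = {}
    for site, forms in summed.items():
        total = sum(forms.values())
        if total <= 0:
            raise ValueError(f"no signal recorded for site {site}")
        fractions[site] = {label: value / total for label, value in forms.items()}
    return fractions


def percent_occupied(site_result):
    """Occupancy is everything that is not the unoccupied peptide."""
    return 100.0 * (1.0 - site_result.get("unoccupied", 0.0))


# Hypothetical intensities (arbitrary units), for illustration only.
example = [
    ("siteA", "Man5GlcNAc2", 9.5),
    ("siteA", None, 0.5),
    ("siteB", "Man5GlcNAc2", 2.0),
    ("siteB", "Man5GlcNAc2", 1.0),  # e.g. a second charge state of the same glycopeptide
    ("siteB", None, 7.0),
]
result = site_fractions(example)
for site in sorted(result):
    print(site, f"{percent_occupied(result[site]):.0f}% occupied")
```

Summing before normalizing means that multiple charge states or missed-cleavage variants of the same glycopeptide contribute to a single glycoform total.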

Data Analysis
Data were plotted in OriginPro. Native mass spectra in Fig 1 were modelled essentially as described [4] using a resolution parameter that corresponded to a peak width of 180 Th. The ACE2 spectrum in Fig 2A was deconvoluted using UniDec with default parameters except for the following, which were empirically optimized: smoothing of 2; no baseline subtraction; m/z range of 6000-8000; charge range of 5-50; mass range of 5000-500,000 Da; sample mass every 10 Da; width of 1 Th. The mass distribution was then plotted in OriginPro.
To generate the bar plots in Fig 2B, the relative quantity of each dimer glycoform resulting from self-association of each monomer glycoform (G1, G2, G3) was determined mathematically, akin to predicting isotope ratios [5]. Briefly, a polynomial expansion of the monomer glycoforms was used to identify the likelihood of observing a given dimer state. The peak areas for the three different monomeric ACE2 glycoforms (G1, G2, G3) were determined and then normalized to give their probabilities ($p_1$, $p_2$, $p_3$), where $p_1 + p_2 + p_3 = 1$. G1 is the monomer glycoform with the lowest mass and G3 the highest mass. The six distinct dimer masses that result from pairwise combinations of G1, G2, and G3 were determined. To determine the probability of observing any dimer glycoform from the pairwise association of the pool of monomers, a polynomial expansion of the monomer probabilities was determined:

$$(p_1 + p_2 + p_3)(p_1 + p_2 + p_3) = p_1^2 + p_2^2 + p_3^2 + 2p_1p_2 + 2p_1p_3 + 2p_2p_3$$

For example, the probability of observing G1G1 is $p_1^2$. The probability distribution was plotted in Fig 2B as gray bars.
To determine the dissociation constants (Kds) for ACE2 dimerization, the concentrations of all species were plotted against the total protein concentration and fit to an equilibrium binding model essentially as described [6]. The same procedure was followed to determine the overall (global) Kd for the formation of Spike-ACE2 complexes: the fractional signal of each complex was plotted against the respective protein concentration and fit to a single-site equilibrium binding model using OriginPro or a Python script. For the global Spike-ACE2 Kd, only the linear region of the binding isotherm was captured by the experimental data. An upper estimate of ~10 nM on the modelled Kd was therefore reported, nearly identical to the values reported in the literature using various techniques [7,8,9]. Nevertheless, we modelled isotherms assuming Kd values of 1 nM and 100 nM to demonstrate that the measured Kd is indeed near 10 nM. This exercise demonstrated that the peak areas for the different species identified in our native mass spectrometry assays captured the accepted behavior.
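The expected dimer glycoform distribution from random pairing of monomer glycoforms can be sketched in a few lines. This is an illustration of the polynomial expansion only: the function name and the peak areas are hypothetical, and random (glycoform-independent) self-association is the assumption being modelled.

```python
from itertools import combinations_with_replacement


def dimer_distribution(peak_areas):
    """Expected dimer glycoform probabilities from random pairing.

    peak_areas: {glycoform_name: monomer peak area}. Areas are normalized
    to probabilities p_i, then (sum p_i)^2 is expanded: homodimers get
    p_i^2 and heterodimers get 2 * p_i * p_j.
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    p = {name: area / total for name, area in peak_areas.items()}

    dist = {}
    for a, b in combinations_with_replacement(sorted(p), 2):
        weight = 1.0 if a == b else 2.0  # two orderings for a heterodimer
        dist[a + b] = weight * p[a] * p[b]
    return dist


# Hypothetical monomer peak areas (arbitrary units), for illustration only.
areas = {"G1": 2.0, "G2": 5.0, "G3": 3.0}
dist = dimer_distribution(areas)
for name, prob in dist.items():
    print(f"{name}: {prob:.3f}")
```

Comparing such a predicted distribution with the measured dimer peak areas shows which dimer glycoforms are over- or under-represented relative to random association.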
To determine Kd1 and Kd2 for the formation of Spike-ACE2 and Spike2-ACE2, respectively, the fractional signal in each complex was determined from the experimental data, where S, A, SA, and S2A are the fractional signals of free Spike, free ACE2 dimers, Spike-ACE2, and Spike2-ACE2 in each native mass spectrum. Kd1 and Kd2 are the equilibrium dissociation constants for SA and S2A, respectively.
In all experiments, the quantity of ACE2 and Spike could be accounted for using equations (3) and (4), where Atot and Stot are the molar concentrations of ACE2 and Spike in each experiment. Equations (1) and (2) were rearranged and substituted into (3) and (4) to generate master equations that allow SA and S2A to be solved at any concentration of Stot or Atot. It is difficult to accurately capture changes in signal abundances for ions < 8000 Th when transmitting and detecting large m/z ions (>12,000 Th). Therefore, we opted to hold the concentration of ACE2 constant. This made it straightforward to account for A, the fraction of unbound ACE2, across serial dilutions of Stot. In addition, we only quantified complexes that had resolved peaks corresponding to a charge state distribution that could be assigned. The master equations were encoded into Python and solved for Kd1 and Kd2 as described [10]. Fitting was performed by minimizing the χ² value. After fitting, the goodness of fit was further validated by analyzing the uncertainties in the Kd values via bootstrap analysis with 10,000 replicates (Fig. S3).
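A sequential two-step binding scheme of this kind can be sketched numerically as follows. Everything here is an assumption made for illustration rather than the authors' script: the equilibria are taken in the standard form Kd1 = [S][A]/[SA] and Kd2 = [S][SA]/[S2A] with no statistical factors, the mass balances as Atot = A + SA + S2A and Stot = S + SA + 2·S2A, the χ² as an unweighted sum of squared residuals, and the fit as a coarse grid search. All concentrations and Kd values are hypothetical.

```python
def species(s_tot, a_tot, kd1, kd2):
    """Solve the assumed mass balances for free S by bisection.

    Returns (S, A, SA, S2A) in the same concentration units as the inputs.
    """
    def balance(s):
        denom = 1.0 + s / kd1 + s * s / (kd1 * kd2)  # binding polynomial
        a = a_tot / denom
        sa = a * s / kd1       # from Kd1 = S*A/SA
        s2a = sa * s / kd2     # from Kd2 = S*SA/S2A
        return s + sa + 2.0 * s2a, a, sa, s2a

    lo, hi = 0.0, s_tot        # free S lies between 0 and Stot
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if balance(mid)[0] > s_tot:
            hi = mid
        else:
            lo = mid
    s = 0.5 * (lo + hi)
    _, a, sa, s2a = balance(s)
    return s, a, sa, s2a


def chi_square(data, a_tot, kd1, kd2):
    """Unweighted sum of squared residuals on the SA and S2A fractions of ACE2."""
    total = 0.0
    for s_tot, f_sa, f_s2a in data:
        _, _, sa, s2a = species(s_tot, a_tot, kd1, kd2)
        total += (sa / a_tot - f_sa) ** 2 + (s2a / a_tot - f_s2a) ** 2
    return total


def grid_fit(data, a_tot, kd_grid):
    """Return (chi2, kd1, kd2) for the grid point with the lowest chi2."""
    return min(
        (chi_square(data, a_tot, k1, k2), k1, k2)
        for k1 in kd_grid
        for k2 in kd_grid
    )


# Synthetic, noise-free titration at constant ACE2 (hypothetical values, nM).
A_TOT, TRUE_KD1, TRUE_KD2 = 1000.0, 500.0, 100.0
data = []
for s_tot in (100.0, 300.0, 1000.0, 3000.0):
    _, _, sa, s2a = species(s_tot, A_TOT, TRUE_KD1, TRUE_KD2)
    data.append((s_tot, sa / A_TOT, s2a / A_TOT))

best_chi2, fit_kd1, fit_kd2 = grid_fit(data, A_TOT, [50.0, 100.0, 200.0, 500.0, 1000.0])
print(fit_kd1, fit_kd2)
```

In practice the grid search would be replaced by a continuous optimizer, and a bootstrap over resampled data points would give the uncertainty on each Kd; both are omitted here to keep the sketch short.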

Fig. S1. N-glycan occupancy of Spike determined by glycoproteomic analysis. All N-glycans were identified as high-mannose.