Three Amino Acid Changes In Avian Coronavirus Spike Protein Allow Binding To Kidney Tissue.

Infectious bronchitis virus (IBV) infects ciliated epithelial cells in the chicken respiratory tract. While some IBV strains replicate locally, others can disseminate to various organs, including the kidney. Here we elucidate the determinants for kidney tropism by studying interactions between the receptor binding domain (RBD) of the viral attachment protein spike from two IBV strains with different tropisms. Recombinantly produced RBDs from the nephropathogenic IBV strain QX and from the non-nephropathogenic strain M41 bound to the epithelial cells of the trachea. In contrast, only QX-RBD bound more extensively to cells of the digestive tract, urogenital tract, and kidneys. While removal of sialic acids from tissues prevented binding of all proteins to all tissues, binding of QX-RBD to trachea and kidney could not be blocked by pre-incubation with synthetic alpha-2,3-linked sialic acids. The lack of binding of QX-RBD to a previously identified IBV-M41 receptor was confirmed by ELISA, demonstrating that tissue binding of QX-RBD is dependent on a different sialylated glycan receptor. Using chimeric RBD proteins, we discovered that the region encompassing amino acids 99-159 of QX-RBD was required to establish kidney binding. In particular, QX-RBD amino acids 110-112 (KIP) were sufficient to confer on IBV-M41 the ability to bind to kidney, while the reciprocal mutations in IBV-QX abolished kidney binding completely. Structural analysis of both RBDs suggests that the receptor binding site for QX lies at a different location on the spike than that of M41.
Importance: Infectious bronchitis virus is the causative agent of infectious bronchitis in chickens. Upon infection of chicken flocks, the poultry industry faces substantial economic losses through diminished egg quality and increased morbidity and mortality of infected animals. While all IBV strains infect the chicken respiratory tract via the ciliated epithelial layer of the trachea, some strains can also replicate in the kidneys, dividing IBV into two pathotypes: non-nephropathogenic (for example, IBV-M41) and nephropathogenic (including IBV-QX) viruses. Here we set out to identify the determinants for the extended nephropathogenic tropism of IBV-QX. Our data reveal that each pathotype makes use of a different sialylated glycan ligand, with binding sites on opposite sides of the attachment protein. This knowledge should facilitate the design of antivirals to prevent coronavirus infections in the field.

affected depend primarily on the IBV strain (2). Phylogenetic classification of IBV strains results in 32 phylogenetic lineages (GI-1 to GI-27 and GII to GVI) (3), of which GI-1 includes the historically first IBV genotype identified, Massachusetts (IBV-Mass). IBV-Mass infections are reported worldwide, and in Europe, GI-1 is currently the third most prevalent genotype (2). The most prevalent IBV genotype circulating in Europe is IBV-QX (GI-19) (2, 3), which, in contrast to IBV-Mass, has been reported to cause kidney disease (2).
IBV primarily infects the respiratory tract, where the virus can bind and infect the ciliated epithelial lining of the trachea (4,5). Upon infection with IBV, clinical symptoms such as snicking, wheezing, and/or nasal discharge are reported (6). While infection with IBV-Mass (of which strain M41 is the prototype) is predominantly detected in the upper respiratory tract (7), including the trachea (2), replication of IBV-QX is additionally found in the kidneys (7-9), oviduct, and the gastrointestinal tract (10,11), leading to additional clinical symptoms such as swollen proventriculus (12) and reduced egg production (13,14). Because of these additional clinical symptoms, IBV-QX is described as a nephropathogenic IBV strain (2).
Binding to host tissues is the first step in the viral life cycle of IBV and therefore a critical factor in determining tissue tropism. Tissue tropism differs based on the amino acid composition of the spike protein as shown by recombinantly produced proteins (15-17) and infection assays with recombinant viruses (18). The spike of IBV is posttranslationally cleaved into two subunits, S1 and S2, where S2 is anchored in the virus membrane and important for membrane fusion. S1 comprises the head domain of spike and is responsible for host receptor binding (19). Using recombinantly expressed M41-S1 proteins, alpha-2,3-linked sialic acids were identified as the IBV receptor on a glycan array, where specific binding to the ligand Neu5Acα2-3Galβ1-3GlcNAc was observed (19). Recently the cryo-electron microscopy (cryo-EM) structure of the M41 spike has been resolved (20), indicating that the S1 subunit consists of two independent folding domains, the N-terminal domain (NTD) (amino acids 21 to 237) and C-terminal domain (CTD) (amino acids 269 to 414), with a proposed receptor-binding site in both domains. Experimental evidence using recombinantly expressed spike domains has indicated that amino acids 19 to 272 of the M41 spike are sufficient for binding to trachea as well as binding to alpha-2,3-linked sialic acids (15). This domain thus contains a receptor-binding domain (RBD) and can be used to study the biological implications of genetic variation in circulating IBV genotypes.
In this study, we set out to identify how genetic variations in IBV spike proteins have contributed to different host tropisms. We demonstrate that QX-RBD binding to trachea and kidney is dependent on a different sialylated glycan ligand compared to M41-RBD. In particular, introduction of amino acids 110 to 112 (KIP) of the QX spike into M41-RBD was sufficient to extend its tropism toward the kidney. Previous docking experiments (17) and structural analysis suggest that the binding pockets for the different glycans are located on opposite sides of each spike protein.

RESULTS
The N-terminal domain of IBV-QX spike contains a receptor-binding domain. Across the first 257 amino acids of the IBV-QX and IBV-M41 spike sequences, 85% of the residues are either identical or similar. Here, we set out to determine which of the dissimilar amino acids are the determinants for the difference in tissue tropism.
In previous work, we demonstrated that the M41-RBD was sufficient to bind to chicken trachea (15). To verify that no additional sites are present in M41 that could bind to kidney or trachea tissue, we produced recombinant proteins consisting of the full ectodomain (ED), the S1 portion of the ED, the RBD (NTD of S1), and the CTD of S1. Each protein was assessed for binding using trachea and kidney tissue slides. M41-ED, S1, and RBD, but not CTD, bound to the ciliated epithelium of the trachea, specifically at the base of the cilia (Fig. 1), confirming previous observations (15,19,21). None of the proteins bound kidney tissue, as shown by a representative picture using M41-RBD (Fig. 1B). In enzyme-linked immunosorbent assay (ELISA), M41-RBD, M41-S1, and M41-ED bound the known ligand (Neu5Acα2-3Galβ1-3GlcNAc) with affinities that did not differ significantly from each other but were significantly higher than those of M41-CTD and turkey coronavirus (TCoV)-S1 (Fig. 1C). These results indicate that ligand binding of M41-RBD is not significantly different from that of M41-S1 and M41-ED, suggesting that no additional ligand-binding motifs are present in S1 and ED; thus, in the remaining experiments, we used M41-RBD, as this recombinant protein reflects the tissue tropism of the virus.
Amino acid alignment of the mature protein sequence of the receptor-binding domain (RBD) of M41 and a comparable size fragment of the QX spike displayed a sequence identity of 73.6% (Fig. 2A), with the highest sequence diversity between amino acids 37 to 60 and 98 to 115. These regions include the previously described hypervariable regions (HVRs) (highlighted in gray) of M41-S1 (22). Before studying whether sequence diversity between the RBDs of M41 and QX contributes to the reported broader tropism of QX in vivo, we first determined whether the potential RBD of QX behaved like that of M41 (Fig. 1) and whether it contains a receptor-binding domain (15). Both proteins were produced as soluble recombinant protein in mammalian cells and analyzed on Western blots after purification. Before loading, a fraction was pretreated with peptide-N-glycosidase F (PNGase F) to remove posttranslational glycosylation. QX- and M41-RBD migrated comparably at around 55 kDa (including glycosylation) and had a backbone of around 32 kDa as expected after PNGase F treatment (Fig. 2B). Circular dichroism (CD) spectroscopy was used to assess similarities in secondary structure between M41- and QX-RBD. Spectra at all temperatures followed the same curve, and both proteins had similar broad melting curves, indicating that both proteins are equally stable (data not shown). Subsequent secondary structure calculations using DichroWeb (23) indicated that M41- and QX-RBD contain 29 and 25% α-helix, 16 and 17% β-strands, and 55 and 58% random structures, respectively (Fig. 2C). Finally, we confirmed that the QX-RBD was biologically active by applying it to chicken trachea tissue slides in protein histochemistry. We observed clear binding to the ciliated lining of epithelial cells and structures present in the kidney (Fig. 2D), indicating that QX-RBD, like M41-RBD, contains a receptor-binding site.
Fig. 1C legend: Affinity of M41 spike proteins for the known ligand (Neu5Acα2-3Galβ1-3GlcNAc) in solid-phase ELISA. At all protein amounts, a significant difference of at least P < 0.01 was observed between M41-RBD, M41-S1, M41-ED, and M41-CTD and TCoV-S1, which served as a negative control tested in two-way ANOVA. ELISA was performed in triplicate where average and standard deviations are shown.
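The 73.6% sequence identity quoted above is a column-wise comparison of two aligned sequences. The sketch below shows one common way to compute such a figure; the source does not state which definition or tool was used, so the function name `percent_identity`, the choice of denominator (all alignment columns except those where both sequences have a gap), and the toy strings are assumptions for illustration only, not the actual M41 or QX sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: identity = identical, non-gap columns divided by the number
    of alignment columns, ignoring columns where both sequences are gapped.
    Other definitions (e.g. dividing by the shorter sequence length) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no residue in either sequence
        columns += 1
        if a == b:  # cannot be a gap here, since gap/gap was skipped above
            matches += 1

    if columns == 0:
        raise ValueError("alignment contains no residues")
    return 100.0 * matches / columns


# Toy alignment (hypothetical, NOT real spike sequence): 3 of 7 columns match.
toy_a = "MLQ-YKT"
toy_b = "KIPSYKT"
print(round(percent_identity(toy_a, toy_b), 1))
```

Because the result depends on how gap columns are counted, any comparison with a published identity value should use the same alignment and the same definition.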
QX-RBD shows a broader tissue tropism than that of M41-RBD. Next, we used M41- and QX-RBDs to study the distribution of host attachment factors across chicken tissues. To this end, we allowed both proteins to bind to tissue microarray slides containing 28 different chicken tissues (24). Binding of M41-RBD was primarily found on the ciliated lining of the epithelium of the proximal and distal trachea (Fig. 1), but additional staining was observed in the epithelial lining of the colon, cecal tonsil, ureter, oviduct, and conjunctiva (Table 1). QX-RBD bound to the same tissues as M41-RBD, but additional binding was observed in gizzard, ileum, and cloaca of the digestive tract, as well as liver and kidneys (Table 1 and Fig. 2D), reflecting the replication sites observed in vivo for both genotypes. Detailed analysis of staining present in the kidney showed that binding of QX-RBD was restricted to the parietal epithelium of Bowman's capsule in the glomerulus (Fig. 2D). No binding to the glomeruli was observed when using M41-RBD in three independent experiments using different protein batches. Taken together, QX-RBD shows a markedly broader binding profile than that of M41-RBD, which is in line with the reported broader tissue tropism in vivo (2).

QX-RBD binds to sialic acids on chicken tissues. To investigate whether the expanded tropism of QX-RBD can be explained by binding with similar specificity, but higher affinity, to the previously identified M41 receptor (19), we preincubated both RBD proteins with the synthetic Neu5Acα2-3Galβ1-3GlcNAc before applying them to trachea and kidney tissue slides. As expected, binding of M41-RBD to the trachea was completely prevented (Fig. 3A, middle column) in the presence of the synthetic M41 ligand. In contrast, QX-RBD still showed strong binding to the ciliated epithelium of the trachea and glomeruli of the kidney. To confirm the loss of binding of QX-RBD to Neu5Acα2-3Galβ1-3GlcNAc, a solid-phase ELISA was performed, in which Neu5Acα2-3Galβ1-3GlcNAc was coated. As expected, no binding of QX-RBD to this particular glycan was observed at any of the protein concentrations, which is comparable to that of the negative control TCoV-S1 (only binding longer branched galactose-terminated glycans [25]), while M41-RBD bound to Neu5Acα2-3Galβ1-3GlcNAc in a concentration-dependent manner (Fig. 3B).
Fig. 3 legend (partial): (A) ... Neu5Acα2-3Galβ1-3GlcNAc (middle column) or pretreatment of tissues with Arthrobacter ureafaciens neuraminidase (right column). (B) Affinity of RBD proteins for Neu5Acα2-3Galβ1-3GlcNAc in ELISA. **, P < 0.01; ****, P < 0.001 tested in two-way ANOVA. TCoV-S1 was used as a negative control in equal molar amounts. ELISA was performed in triplicate with all proteins; average is shown with standard deviations.
To reveal whether QX-RBD exclusively depends on sialic acids, trachea and kidney tissue slides were pretreated with Arthrobacter ureafaciens neuraminidase before applying M41- and QX-RBD. Removal of sialic acids from trachea and kidney tissue completely prevented binding of both RBD proteins (Fig. 3A, right column), indicating that QX-RBD binding is dependent on the presence of sialic acids on host tissues.
M41-RBD gains kidney binding upon MLQ107-109KIP mutation. To gain in-depth knowledge of the interaction between the IBV RBD proteins and chicken tissue, we set out to determine the critical amino acids of the viral spike proteins involved in binding to these glycan receptors and thereby in the ability to bind to kidney tissue. Chimeric RBD proteins were generated by dividing each wild-type RBD into three domains and mixing them to get six different combinations (schematic representations in Fig. 4A). These chimeras were then applied to trachea and kidney tissue slides (Fig. 4B). Like wild-type RBDs, binding of all chimeric proteins was dependent on the presence of sialic acids, as pretreatment of host tissues with Arthrobacter ureafaciens neuraminidase (AUNA) abrogated binding (data not shown). M-M-Q, Q-M-M, and Q-M-Q proteins had reduced affinity for Neu5Acα2-3Galβ1-3GlcNAc (Fig. 4C), potentially explaining the reduced staining of trachea tissue by these proteins (Fig. 4B). None of the RBD proteins containing the middle QX sequence (Q-Q-M, M-Q-Q, and M-Q-M) had affinity for this glycan in the ELISA, as expected based on tissue staining (Fig. 4B and C), which is in line with the hypothesis that these proteins depend on binding to the QX receptor instead of the known M41 receptor. These results indicate that the receptor-binding site responsible for recognition of the QX receptor is determined by amino acids 99 to 159 of the spike.
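The six chimeras follow from simple counting: with two parental RBDs and three segments there are 2^3 = 8 segment combinations, two of which are the wild types. The sketch below enumerates the remaining six using the M/Q naming that appears in the text (for example M-Q-M); the function name `chimera_names` and its parameters are illustrative assumptions, not part of the published methods.

```python
from itertools import product


def chimera_names(parents=("M", "Q"), n_segments=3):
    """List all segment combinations except the pure wild-type ones.

    Each name gives the parental origin of the N-terminal, middle, and
    C-terminal segment, e.g. "M-Q-M" carries only the middle segment of QX.
    """
    names = []
    for combo in product(parents, repeat=n_segments):
        if len(set(combo)) > 1:  # skip M-M-M and Q-Q-Q (wild types)
            names.append("-".join(combo))
    return names


chimeras = chimera_names()
print(len(chimeras), chimeras)

# Chimeras that carry the middle segment of QX, the group that lacked
# affinity for the M41 ligand in the ELISA described above.
middle_qx = [name for name in chimeras if name.split("-")[1] == "Q"]
print(middle_qx)
```

Grouping the names by the origin of the middle segment reproduces the split between the two sets of chimeras discussed in the paragraph above.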
To ultimately determine the critical residues of the RBD for the interaction with chicken kidney tissue, additional chimeric proteins were produced and used in protein histochemistry. We exchanged two triplets (highlighted in dark green in Fig. 2A) of amino acids in HVR 2 (amino acids 99 to 115 of M41), either alone or in combination, that had the highest diversity in amino acid characteristics (schematic representations in Fig. 5A). Introduction of the M41 sequence in the QX-RBD protein, SGS100-102Y (QX-Y) and KIP110-112MLQ (QX-MLQ) and their combination (QX-Y-MLQ), all resulted in a loss of binding to trachea and kidney tissues (Fig. 5B, right). In contrast, introduction of MLQ107-109KIP into M41-RBD (M41-KIP) resulted in binding to glomeruli in kidney tissue, both in a wild-type background and in the Y99SGS (M41-KIP-SGS) mutant (Fig. 5B, left). In the ELISA, both M41-SGS and M41-KIP demonstrated a decreased affinity for alpha-2,3-linked sialic acids compared to that of M41-RBD, while introduction of both triplets SGS and KIP (M41-SGS-KIP) completely abolished binding to this glycan (Fig. 5C). Taken together, these results suggest that a receptor-binding site critical to establish kidney binding requires amino acids KIP at positions 107 to 109 in M41-RBD.
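The mutant names above encode "original residues, position, replacement residues" (for example MLQ107-109KIP or Y99SGS). As a minimal sketch of how such a substitution can be applied and checked on a sequence string, the hypothetical helper `substitute` below uses 1-based positions and refuses to act if the expected residues are not found. The ten-residue toy sequence is invented for illustration and is not spike sequence.

```python
def substitute(seq: str, start: int, old: str, new: str) -> str:
    """Replace `old`, which must begin at 1-based position `start`, by `new`.

    `new` may differ in length from `old` (as in a Y -> SGS exchange), in
    which case the numbering of all downstream residues shifts.
    """
    i = start - 1
    if seq[i:i + len(old)] != old:
        found = seq[i:i + len(old)]
        raise ValueError(f"expected {old!r} at position {start}, found {found!r}")
    return seq[:i] + new + seq[i + len(old):]


# Toy sequence (hypothetical): Y at position 3, MLQ at positions 6 to 8.
toy = "AAYAAMLQAA"

# Equal-length triplet exchange, analogous in form to MLQ107-109KIP.
toy_kip = substitute(toy, 6, "MLQ", "KIP")
print(toy_kip)

# Length-changing exchange, analogous in form to Y99SGS: the sequence grows
# by two residues, so the MLQ triplet now starts two positions later.
toy_sgs = substitute(toy, 3, "Y", "SGS")
print(toy_sgs, toy_sgs.index("MLQ") + 1)
```

Verifying the original residues before substitution guards against off-by-one numbering errors, which are easy to make when two homologous sequences differ in length.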
Receptor-binding site of the QX-specific receptor differs from that proposed for M41. Finally, we modelled QX-RBD based on a structural overlay with the recently resolved cryo-EM structure of the M41 spike (20) and focused on the amino acids allowing kidney binding. The overall structure of both proteins is comparable (Fig. 6A, green ribbon, M41; blue ribbon, QX); however, the loop consisting of HVR 2 is slightly larger in QX-RBD, as expected, because two additional amino acids are present (Fig. 6A, SGS100-102 for QX-RBD versus Y99 in M41-RBD). Interestingly, this loop was predicted to be involved in sugar binding (20), which we showed to be true for QX-RBD but not for M41-RBD. In detail, the tyrosine (Y99) in the M41 structure (Fig. 6A, beige) occupies more space than serine (S in QX) and can be seen reaching toward a neighboring loop. Furthermore, the 110-112KIP sequence identified in QX-RBD (Fig. 6A, dark blue) places a positive charge at the protein surface, which is not present in 107-109MLQ in M41-RBD (Fig. 6A, light blue). Previous in silico docking analysis performed with potential alpha-2,3-linked ligands to the M41-RBD protein identified amino acids S87, N144, and T162 as potentially involved in receptor binding (17). When we highlighted these amino acids predicted to be involved in binding to alpha-2,3-linked sialic acids (Fig. 6B, red spheres), we found that the binding sites for the different ligands recognized by M41 and QX are on different sides of the protein (Fig. 6B). Furthermore, when these amino acids were highlighted in the full cryo-EM resolved structures of M41 (Fig. 6C) and QX (Fig. 6D), it was clear that the potential ligand-binding site of M41 is at a different location compared to the QX ligand-binding site (Fig. 6C and D).
In conclusion, we demonstrate that IBV-QX recognizes a sialylated glycan receptor present on chicken tissues that differs from that recognized by M41 and that this binding is likely required for the extended in vivo tissue tropism of the virus.

DISCUSSION
In this study, we reveal that nephropathogenic IBV-QX shows expanded binding tropism based on interactions with sialic acid(s) on chicken tissues that differs from the
Fig. 6 legend (partial): ... (20). S2 is in dark gray for all monomers. S1 is in light gray with one S1 monomer colored bright green for the RBD domain and pale green for the CTD. Amino acids involved in ligand binding are highlighted as follows: yellow is 99Y (100 to 102 SGS in QX), dark blue is 107 to 109 MLQ (110 to 112 KIP in QX), and red is S87, N144, and T162. (D) Modeled QX spike based on PDB accession number 6cv0, colors as indicated in panel C, except the S1 of QX is blue, and the RBD is bright blue. Representations on right of panels C and D are structures turned 90 degrees toward the viewer. All representations were made using PyMOL viewer.
To elucidate the specific ligand used by QX-RBD, we performed several binding studies using previously developed glycan arrays (30,31) containing multiple linear and branched glycans capped without or with alpha-2,3-linked sialic acids or alpha-2,6-linked sialic acids. Unfortunately, no binding was observed using our RBD proteins. This may be explained by the usage of RBD proteins instead of the full S1, as used previously (19), or by the composition and fine structure of the glycans present in both arrays. On the arrays used, most glycans contain the linkage found in mammals (Galβ1,4GlcNAc), while the minority contain a Galβ1,3GlcNAc linkage. The exact nature of the receptor recognized by IBV-QX could be a more complex glycan containing a Galβ1,3GlcNAc that is scarcely populated on glycan arrays.
Comparison of the spikes of various IBV strains with reported nephropathogenicity, including IBV clade GI-14 (including strain B1648 [3]) and clade GI-13 (including strain 793B [3]), shows that only nephropathogenic IBV clades contain an amino acid triplet at positions 100 to 102, whereas in IBV-Mass genotypes, 99Y/H is expressed, thereby shortening HVR 2 by two amino acids. This amino acid triplet (100 to 102 in QX-RBD) varies among nephropathogenic IBV genotypes: SGS/SGT for clade GI-19 (IBV-QX), NQQ/SQQ for clade GI-13 (IBV-793B), and SGA for clade GI-14 (IBV-B1648). Furthermore, the amino acid triplet 110 to 112 KIP is not conserved across IBV genotypes. In these genotypes, amino acid triplets LIQ for B1648 and MIP for 793B are present, which are sequence combinations of amino acids found in Mass (clade GI-1) and QX (clade GI-19). In terms of hydrophobicity and size, amino acid triplet MLQ (M41) is very similar to LIQ (B1648), whereas the proline (P) in KIP (QX) and MIP (793B) reduces the flexibility of the loop.
Structural analysis of the RBD of IBV suggests that the receptor-binding sites for M41 and QX are positioned at different sides of the RBD (Fig. 6B through D). Previous in silico predictions of the interaction with alpha-2,3-linked sialic acid ligands in M41-RBD pointed toward three amino acids, S87, N144, and T162, which are in close proximity to four essential N-glycosylation sites (N33, N59, N85, and N160) (17). Although the amino acids S87, N144, and T162 are conserved between M41 and QX, one of the essential N-glycosylation sites is at a different position (N59 in M41 and N58 in QX). This may result in a different conformation of the ligand-binding site, thereby preventing wild-type QX-RBD from binding to the M41 ligand. This was supported by experimental evidence using the chimeric M41-RBD protein in which this glycosylation site was replaced, resulting in loss of binding to trachea tissue (data not shown). Furthermore, in the publication where the cryo-EM structure of M41 was resolved, the loop consisting of amino acids present in HVR 2 of the spike was proposed to be required for receptor binding (20). Our data point toward involvement of the unglycosylated loop containing HVR 2 in recognition of the QX glycan ligand but not the M41 ligand. Furthermore, the cryo-EM structure of the M41-CTD predicts other putative receptor-binding motif loops in the M41 spike (20). In Fig. 1, we demonstrated that no binding to trachea and kidney tissue was observed using our recombinantly expressed M41-CTD, in contrast to their published results. As binding of QX-RBD reflects the tissue tropism of QX-infected birds, we question whether these loops (in the CTD) are necessary for initial receptor recognition and are involved in QX infection.
In conclusion, we demonstrated that IBV-QX binding to chicken trachea and kidney tissue is dependent on a sialylated glycan receptor and that amino acids in HVR 2 of the QX-RBD are critical for this receptor-binding profile. This knowledge adds to our understanding of differences in tissue tropism between IBV strains in vivo and may contribute to designing new antivirals to prevent coronavirus infections in the field.

MATERIALS AND METHODS
Construction of the expression plasmids. The expression plasmids containing the codon-optimized M41-ED (amino acids 19 to 1091 [21]), M41-S1 (amino acids 19 to 532 [19]), M41-RBD (amino acids 19 to 272 [15]), and M41-CTD (amino acids 273 to 532 [15]; GenBank accession number AY851295) sequences followed by a trimerization domain (GCN4) and strep-tag (ST) were described previously (15). The codon-optimized sequence of QX-RBD (amino acids 19 to 275; GenBank accession number AFJ11176), containing upstream NheI and downstream PacI restriction sites, was obtained from GenScript and cloned into the pCD5 expression vector by restriction digestion as previously described (19). Fragments to generate chimeric RBD proteins were created by splice overlap extension PCR using the primers in Table 2 and cloned into the pJET vector (Thermo Scientific, USA). The sequences were verified by automated nucleotide sequencing (Macrogen, The Netherlands) before cloning each fragment into the pCD5 expression vector. Mutations up to 9 nucleotides (nt) were introduced by site-directed mutagenesis using the primers listed in Table 2, and the sequences were subsequently verified by automated nucleotide sequencing (Macrogen, The Netherlands).
Production of recombinant proteins. Recombinant RBD proteins were produced in human embryonic kidney (HEK293T) cells. In short, cells were transfected with pCD5 expression vectors using polyethylenimine (PEI) at a 1:12 (wt/wt) ratio. Cell culture supernatants were harvested after 6 days. The recombinant proteins were purified using Strep-Tactin Sepharose beads as previously described (19). Proteins were pretreated (where indicated) with PNGase F (New England Biolabs, USA) according to the manufacturer's protocol before analysis by Western blotting using Strep-Tactin horseradish peroxidase (HRP) antibody (IBA, Germany).
Circular dichroism. Recombinant IBV RBD proteins were exchanged into buffer containing 10 mM sodium phosphate, pH 7.75, and diluted to 0.06 mg/ml. CD spectra were collected on a Jasco J-810 spectropolarimeter with a Peltier thermostatted fluorescence temperature controller module by accumulating 4 scans from 285 to 190 nm with a scanning speed of 10 nm/min, digital integrated time of 1 s, bandwidth of 1 nm, and standard sensitivity at 25°C. A thermal melt was done from 25°C to 95°C with a ramp rate of 1°C per minute. A full CD scan was collected at 95°C. After lowering the temperature to 25°C, the protein was allowed to refold for 20 min at 25°C, and a third CD scan was taken at 25°C to measure recovery. Secondary structure calculations for the CD data collected at 25°C before the thermal melt were processed by DichroWeb (23) using the CDSSTR (32), Selcon3 (33), and ContiLL (34) algorithms with protein reference set 7. Results from the 3 algorithms were averaged and plotted in Fig. 2C.
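The final averaging step described above can be sketched as follows. This is only an illustration of taking the per-class mean over the three algorithm outputs: the function name `average_fractions`, the class labels, and all numeric values are placeholders chosen for the example and are not the DichroWeb results reported in Fig. 2C.

```python
from statistics import mean


def average_fractions(per_algorithm):
    """Average secondary-structure fractions across deconvolution algorithms.

    `per_algorithm` maps an algorithm name to a dict of
    {structure class: fraction}. Every algorithm must report the same classes.
    """
    results = list(per_algorithm.values())
    if not results:
        raise ValueError("no algorithm results supplied")
    classes = set(results[0])
    for result in results[1:]:
        if set(result) != classes:
            raise ValueError("algorithms report different structure classes")
    return {c: mean(result[c] for result in results) for c in sorted(classes)}


# Placeholder values only (hypothetical); each row sums to 1.0.
PLACEHOLDER = {
    "CDSSTR": {"helix": 0.2, "strand": 0.1, "other": 0.7},
    "Selcon3": {"helix": 0.3, "strand": 0.2, "other": 0.5},
    "ContiLL": {"helix": 0.4, "strand": 0.3, "other": 0.3},
}

averaged = average_fractions(PLACEHOLDER)
for structure_class, fraction in averaged.items():
    print(f"{structure_class}: {100 * fraction:.0f}%")
```

Averaging per class keeps the fractions summing to one as long as each individual algorithm output does.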