The non-typeable Haemophilus influenzae major adhesin Hia is a dual function lectin that binds to human-specific respiratory tract sialic acid glycan receptors

Non-typeable Haemophilus influenzae (NTHi) is a human-adapted pathogen that colonises the human respiratory tract. Strains of NTHi express multiple adhesins; however, there is a unique, mutually exclusive relationship between the major adhesins Hia and HMW1/2. Approximately 25% of NTHi strains express Hia, a phase-variable autotransporter protein that has a critical role in colonisation of the host nasopharynx. The remaining 75% of strains express HMW1/2. Previous work has shown that the HMW1 and HMW2 proteins mediate binding to 2,3- and 2,6-linked sialic acid glycans found in the human respiratory tract. Here we show that the high-affinity binding domain of Hia, binding domain 1 (BD1), is responsible for binding to α2,6-sialyllactosamine glycans. BD1 is highly specific for glycans that incorporate the form of sialic acid expressed by humans, N-acetylneuraminic acid (Neu5Ac). We further show that Hia has lower-affinity binding activity for 2,3-linked sialic acid and that this binding activity is mediated via a distinct domain. Thus Hia, with its dual binding activities, functionally mimics the combined activities of the HMW1 and HMW2 adhesins. In addition, we show that Hia has a role in biofilm formation by strains of NTHi that express the adhesin. Knowledge of the binding affinity of a major NTHi adhesin, and putative vaccine candidate, will direct and inform development of future vaccines and therapeutic strategies for this important pathogen.

Importance
Host-adapted bacterial pathogens like NTHi have evolved specific mechanisms to colonize their restricted host niche. Relatively few of the adhesins expressed by NTHi have been characterized with regard to their binding affinity at the molecular level. In this work we show that the major NTHi adhesin, Hia, preferentially binds to Neu5Ac-α2,6-sialyllactosamine, which carries the form of sialic acid expressed in humans. The receptors targeted by Hia in the human airway mirror those targeted by influenza A virus and indicate the broad importance of sialic acid glycans as receptors for airway pathogens.


Introduction
Non-typeable Haemophilus influenzae (NTHi) is a human-adapted pathogen, responsible for multiple acute and chronic infections of the respiratory tract, including otitis media (OM) (1), community-acquired pneumonia (2), and chronic obstructive pulmonary disease (COPD) exacerbations (3). Each year, 31 million new cases of the most severe form of OM, chronic suppurative OM, are diagnosed (4), and 60% of those affected suffer an associated hearing loss. Globally, there are over 700 million cases of acute OM every year (4); in the USA alone, each year there are ~25 million episodes of acute OM, >13 million antibiotic prescriptions, and public health costs estimated at $3-$5 billion (5, 6). According to WHO estimates, approximately 65 million people have moderate to severe COPD. Over 3 million people died of COPD in 2005, which corresponded to 5% of all deaths globally (7). Invasive disease caused by NTHi has increased significantly in recent years, in part due to vaccines against Haemophilus influenzae type b and Streptococcus pneumoniae (8). At present, there is no effective vaccine against NTHi.

NTHi is commonly carried asymptomatically in the human nasopharynx. Many bacterial pathogens express outer-surface proteins that target specific host molecules to allow them to adhere to and persist in specific niches in the host. Examples of bacterial adhesins recognising particular host proteins include the type IV pilus of Neisseria gonorrhoeae, which recognises host integrins (9); the type IV pilus of NTHi, which recognizes ICAM1 (10); the curli pili of Salmonella enterica, which bind host TLR2 receptors; and the FimH protein of uropathogenic Escherichia coli, which binds to mannosylated glycoproteins (11). Many bacteria also express virulence factors that belong to the autotransporter protein family. These proteins have a diverse array of functions including adhesion to host surfaces (12). Autotransporter proteins are characterised by a large barrel-like C-terminal domain which inserts into the outer membrane, forming a pore through which the N-terminal effector portion passes to reach the extracellular environment (13, 14). NTHi express many autotransporter proteins (15) that fulfil a variety of roles in NTHi pathobiology. One of these autotransporters, Hia, is an adhesin that is expressed by approximately 25% of NTHi strains (16). The remaining ~75% of NTHi strains express the HMW1/2 proteins (17), which have previously been demonstrated to be involved in adhesion of NTHi to human cells (18). It is unclear why strains encode genes for Hia or HMW but never both. The HMW1 protein binds to host cell glycans as cellular receptors, specifically α2,3-sialyllactosamine (2-3 SLN) (19). We recently demonstrated that HMW2, which is ~65% identical to HMW1, binds the related glycan α2,6-sialyllactosamine (2-6 SLN), with high specificity for 2-6 SLN containing N-acetylneuraminic acid (Neu5Ac), the form of sialic acid expressed by humans (20). Intriguingly, 2-3 SLN is found mainly in the lower human respiratory tract, whereas 2-6 SLN is found throughout the entire respiratory tract, but predominates in the upper airway (21). It has previously been demonstrated that Hia is required for adherence to Chang epithelial cells (22), and we have demonstrated that Hia is required for colonisation of the host nasopharynx (23). However, the cellular receptor for Hia is currently unknown. We hypothesized that Hia may also recognize host-specific glycans found in the human respiratory tract. In the current study we present an investigation to identify and characterize the Hia cellular receptor.

Hia is a lectin that recognizes Neu5Ac-α2,6-lactosamine (2-6 SLN-Ac) with high affinity
In order to determine whether Hia had glycan binding activity, we cloned and over-expressed Hia from NTHi strain R2866 in E. coli BL21. Heterologous over-expression of Hia in E. coli was used previously to investigate Hia binding activity (22). Hia over-expression was confirmed by Western blot and whole cell ELISA (Supplementary Figure 1). The glycan binding ability of E. coli strain BL21 cells expressing Hia (BL21-Hia) was compared to that of wild-type BL21 cells using glycan array analysis. The background binding of BL21 alone was subtracted from that of BL21-Hia in order to deduce the glycans bound in an Hia-dependent manner. A subset of the identified glycans were characterised for their binding affinity to BL21-Hia using surface plasmon resonance (SPR; Table 1). These studies demonstrated that Hia bound to a number of sialylated glycans, with the greatest affinity for Neu5Ac-α2,6-lactosamine (2-6 SLN-Ac), with a dissociation constant (K_D) of 185 nM. A comparison of the binding affinity of Hia to matched glycan pairs containing either a terminal N-acetylneuraminic acid (Neu5Ac; the only form expressed in humans) or N-glycolylneuraminic acid (Neu5Gc; which is expressed in most mammals) showed that Hia preferentially binds to structures containing a terminal Neu5Ac (Table 1), with a ~7-fold preference for 2-6 SLN-Ac over Neu5Gc-α2,6-lactosamine (2-6 SLN-Gc) (185 nM vs 1.39 µM; Table 1). Whilst some binding to 2-3 SLN-Ac (2.03 µM; Table 1) was observed, this occurred with approximately 11-fold lower affinity than for 2-6 SLN-Ac (185 nM).
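
The fold preferences quoted above follow directly from the ratios of the reported K_D values. The short Python sketch below reproduces that arithmetic; the function and dictionary names are illustrative only and are not part of the authors' analysis pipeline.

```python
# Illustrative sketch: fold preference as a ratio of dissociation constants.
# K_D values (in nM) are those quoted in the text for full-length Hia (Table 1).

def fold_preference(kd_reference_nM: float, kd_other_nM: float) -> float:
    """Return how many times weaker the 'other' ligand binds than the reference.

    A lower K_D means tighter binding, so the ratio kd_other / kd_reference
    is greater than 1 when the reference ligand is preferred.
    """
    return kd_other_nM / kd_reference_nM

KD_NM = {
    "2-6 SLN-Ac": 185.0,    # 185 nM
    "2-6 SLN-Gc": 1390.0,   # 1.39 uM
    "2-3 SLN-Ac": 2030.0,   # 2.03 uM
}

ac_vs_gc = fold_preference(KD_NM["2-6 SLN-Ac"], KD_NM["2-6 SLN-Gc"])
a26_vs_a23 = fold_preference(KD_NM["2-6 SLN-Ac"], KD_NM["2-3 SLN-Ac"])

print(f"2-6 SLN-Ac over 2-6 SLN-Gc: {ac_vs_gc:.1f}-fold")
print(f"2-6 SLN-Ac over 2-3 SLN-Ac: {a26_vs_a23:.1f}-fold")
```

The two ratios come out at roughly 7.5 and 11, in line with the "~7-fold" and "approximately 11-fold" figures stated in the text.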

Modelling shows key interactions between BD1 residues D618 and A620 and the Neu5Ac moiety of 2-6 SLN-Ac
The Hia protein has previously been shown to contain high- and low-affinity host cell binding domains (BD), termed BD1 and BD2 respectively (22, 24). BD1 and BD2 are proposed to bind a common, but unknown, cellular receptor (24). Hia BD1 consists of amino acids 541-714 inclusive (22), with residues in BD1 shown to be essential for binding to Chang epithelial cells when Hia is expressed in E. coli (22). To determine the molecular basis of the interactions between Hia BD1 and 2-6 SLN-Ac, we carried out molecular docking studies using the previously published Hia BD1 structure (22). All docking structures of 2-6 SLN-Ac with Hia BD1 indicated interaction of the ligand at the interface of chain A and chain C of Hia BD1. Figure 1 shows a bound structure of 2-6 SLN-Ac that represents a sialic acid-specific binding mode, with the negatively charged carboxylate group of the Neu5Ac residue engaging in a strong electrostatic interaction with R674. The glycerol side chain of the sialic acid moiety of 2-6 SLN-Ac plays an important role, as it engages in hydrogen bonds with D618 and A620. Importantly, the high flexibility of the α(2-6)-linkage of 2-6 SLN-Ac allows the coordination of the lactosamine disaccharide moiety. In addition, our docking studies indicate that residue R674 is involved in coordinating 2-6 SLN-Ac in all 25 potential docked conformations.

Hia BD1 is the site of high affinity interactions with the cellular receptor 2-6 SLN-Ac
Using purified Hia BD1 (aa 514-714 inclusive) (22) we investigated BD1 binding specificity using SPR. Table 1 shows that Hia BD1 binds with high affinity and specificity to 2-6 SLN-Ac, with a K_D of 64.9 ± 5.6 nM. This value is in a similar range to the affinity we observe with full-length Hia (185 ± 59.9 nM). Hia BD1 interacts with 2-6 SLN-Gc with ~1000-fold lower affinity (61.28 ± 9.1 µM; see Table 1) than with 2-6 SLN-Ac. In order to determine the specific region of BD1 responsible for the interaction with 2-6 SLN-Ac, we constructed a peptide library of BD1 aa 541-714, consisting of peptides 15 amino acids in length, with consecutive peptides overlapping by 10 aa (for example, peptide one consisted of residues 541-555; peptide two of residues 546-560, etc.; Table 2). We used these peptides to block the interaction between BL21-Hia and 2-6 SLN-Ac using an SPR competition assay. Using this methodology, we show that a peptide comprising twenty amino acid residues and containing both D618 and A620 (p16+17; residues 616-635) blocks 100% of the interaction between BL21-Hia and 2-6 SLN-Ac ( [...] and A620 were previously shown to be key for binding to host cells, as when these residues were mutated (D618K and A620R), binding was lost (22). Our blocking studies provide strong evidence that additional residues, and likely secondary structure around these residues that can only form in the 20mer p16+17, mediate direct interaction between 2-6 SLN-Ac and Hia, leading to high-affinity binding.

In order to confirm these findings, we generated recombinant Hia with the single mutations D618K and A620R, and a double mutant of Hia in which both of these residues were substituted (D618K/A620R double). SPR analysis was used to compare the binding of this panel of Hia mutants with wild-type Hia and BD1, using the same subset of glycans (see Table 3). These findings demonstrated that the A620R Hia mutant and the D618K/A620R Hia double mutant (all located in BD1) completely lose the ability to bind 2-6 SLN-Ac, while still maintaining binding to 2-3 SLN-Ac. Collectively, these data demonstrated that the binding site for 2-3 SLN-Ac is not BD1, and confirmed the role of BD1 in binding specificity to 2-6 SLN-Ac.
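
The layout of the overlapping peptide library described above (15-mers stepping by 5 residues, so that neighbours share 10 residues) can be sketched in a few lines of Python. This is a minimal illustration under the stated window and step sizes; the function names are hypothetical, and the handling of the final residues of BD1 is an assumption, since the authors' actual library is defined in Table 2.

```python
# Illustrative sketch of a sliding-window peptide library over BD1 (aa 541-714).
# Assumption: windows that would run past residue 714 are simply not generated.

def peptide_windows(first: int, last: int, length: int = 15, step: int = 5):
    """Return (start, end) residue ranges, inclusive, for overlapping peptides."""
    windows = []
    start = first
    while start + length - 1 <= last:
        windows.append((start, start + length - 1))
        start += step
    return windows

def merge_adjacent(windows, i: int, j: int):
    """Span covered by peptides i..j (1-based numbering, as in 'p16+17')."""
    return (windows[i - 1][0], windows[j - 1][1])

library = peptide_windows(541, 714)
p16_17 = merge_adjacent(library, 16, 17)

print("peptide 1:", library[0])     # residues 541-555
print("peptide 2:", library[1])     # residues 546-560
print("p16+17:", p16_17, "length", p16_17[1] - p16_17[0] + 1)
print("contains D618 and A620:", all(p16_17[0] <= r <= p16_17[1] for r in (618, 620)))
```

Under these assumptions, peptides 16 and 17 together span residues 616-635, a 20-residue stretch that includes both D618 and A620, matching the p16+17 peptide described in the text.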

Hia is involved in interactions between NTHi and epithelial cells
In order to demonstrate a biological role for Hia in attachment of NTHi to host epithelium, we performed adherence assays using Chang epithelial cells. Prior to carrying out these adherence assays, we confirmed that 2-6 SLN was localized on the surface of these cells using DyLight 649-conjugated SNA, a lectin specific for 2-6 SLN (Figure 2A). Following treatment with sialidase to remove sialylated glycans, 2-6 SLN was no longer detected on the cell surface by SNA (Figure 2A). Using NTHi strain R2866 that expressed Hia (wild-type R2866; R2866 hia+) and an isogenic mutant lacking Hia (R2866 hia::tet), we showed that the ability of NTHi to adhere to Chang cells decreased when NTHi lacked Hia. Wild-type R2866 was unable to bind Chang cells treated with sialidase, which removes sialylated glycan structures (Figure 2B).

Residues D618 and A620 are critical to the interaction of Hia with Chang cells
In order to determine the contribution of the key 2-6 SLN-Ac interacting residues (D618, A620), and of residue R674, indicated as important by our modelling studies, we carried out adherence assays using a Chang epithelial cell model (23) [...] Figure 3) compared to control cells that did not express Hia (empty BL21; 1.26% adherence; P = 0.0007). BL21 expressing the Hia D618K/A620R double mutant exhibited an approximately 4.5-fold decrease in relative adherence (3.21% adherence; P = 0.002) compared to BL21 expressing wt Hia. BL21 expressing Hia R674A showed an approximately 2-fold decrease in relative adherence compared to cells expressing wt Hia (8.4% adherence), but this difference was not statistically significant (P = 0.06). These data indicate that the interaction between Hia and 2-6 SLN-Ac is critical to bacterial interactions with epithelial cells, and demonstrate the key contribution of residues D618 and A620 of Hia BD1 in mediating this interaction.

Expression of Hia in NTHi results in larger, more robust biofilms by strains where the hia gene is present
The role of Hia in biofilm formation by two NTHi strains (R2866 and strain 11; both encoding the hia gene) was tested using our well-defined static biofilm model for NTHi (25). Biofilms were formed for 24 hours at 37°C. Both strain R2866 and strain 11 formed much larger biofilms when Hia was expressed (hia+) compared to when it was absent, as assessed by confocal microscopy (Figure 4). NTHi that expressed Hia (hia+) formed biofilms that had significantly more biomass (P < 0.0001, strain R2866, Figure 4A; P < 0.01, strain 11, Figure 4B) and were significantly thicker (P < 0.0001, strain R2866, Figure 4A; P < 0.05, strain 11, Figure 4B) compared to those formed by strains that did not express Hia (hia::tet). Descriptively, biofilms formed by NTHi that expressed Hia were significantly denser, and had a lawn-like architecture, compared to those formed by the respective isogenic mutant strain that did not express Hia. Biofilms of the two hia::tet isogenic mutant strains were significantly rougher (i.e. had a greater difference in overall surface height and topography) [...] substantially taller and more dense than those formed by NTHi that do not express Hia. These results indicated that the Hia adhesin was a critical determinant of biofilm structure and organization in these strains, possibly due to increased inter-bacterial associations.

Discussion
In this work, we have demonstrated that the NTHi adhesin Hia is a lectin with high specificity for host-specific glycans. Hia mediates high-affinity binding to 2-6 SLN-Ac. Molecular modelling studies using the crystal structure of Hia BD1 (22) in complex with 2-6 SLN-Ac showed that Hia residues D618 and A620, and to some extent R674, were critical to this interaction. We experimentally confirmed our modelling using a diverse and comprehensive array of complementary in vitro studies. Using a combination of E. coli expressing Hia, purified Hia BD1, and a peptide library derived from BD1, we determined that residues D618 and A620 of Hia are required for the high-affinity interaction between Hia and 2-6 SLN-Ac. Interestingly, our SPR data also confirmed that Hia recognises 2-3 SLN-Ac, but the affinity of this interaction was approximately 10-fold lower than that for 2-6 SLN-Ac. However, using the Hia BD1 protein, we confirmed that the interaction of Hia with 2-3 SLN-Ac is not mediated by BD1, which is consistent with previous findings proposing that Hia contains two binding domains (22). Therefore, contrary to previous work which stated that BD1 and BD2 bind the same ligand (24), we have shown that the two binding domains of Hia interact with distinct ligands: BD1 with 2-6 SLN-Ac and BD2 with 2-3 SLN-Ac. Moreover, in both cases, the preference is for the form of sialic acid expressed by humans (Neu5Ac). This offers an insight into the evolution of NTHi as a human-specific pathogen: although Neu5Gc and Neu5Ac (the precursor to Neu5Gc) can be expressed by most mammals, humans only make Neu5Ac-linked glycans, due to a mutation in CMAH, the gene responsible for the conversion of Neu5Ac to Neu5Gc (26). As it appears that Hia preferentially binds Neu5Ac-linked glycans over Neu5Gc-linked glycans, this finding strongly suggests that Hia has evolved to preferentially bind the glycans most likely to be present in its human host. Preference for the Neu5Ac form of sialic acid has also been observed in the utilisation of Neu5Ac for macromolecular biosynthesis of bacterial cell surface glycans in NTHi (27), which further supports the central role of sialic acid in the adaptation of NTHi to its human host.

NTHi strains that do not possess the hia gene instead encode and express the adhesins HMW1 and HMW2 (18). 
Previous work has demonstrated that NTHi strains encode genes for either Hia or HMW1/2, but never both, with approximately 75% of strains expressing HMW1/HMW2 and the remaining 25% expressing Hia. We have recently demonstrated that HMW2 preferentially binds 2-6 SLN-Ac (20), whereas HMW1 has a preference for 2-3 SLN structures (19). However, HMW1 showed no preference for either Neu5Ac- or Neu5Gc-containing structures, and had a much lower affinity than HMW2 for 2-6 SLN-Ac (20). Therefore two distinct NTHi adhesins, Hia and HMW1/2, that show a discrete lineage distribution in the NTHi population have evolved to bind the same subset of glycans: Hia binds 2-6 SLN-Ac preferentially over 2-3 SLN-Ac; HMW2 specifically binds 2-6 SLN-Ac; HMW1 binds a broader range of 2-3 and 2-6 linked glycan structures compared to HMW2, but with lower overall affinity ( [...] Using two diverse strains of NTHi, we showed that Hia is a critical determinant of biofilm development and structure, and that the potential to block Hia function through knowledge of its specific binding affinities could play a key role in targeting biofilm formation during disease caused by NTHi.

To summarize, we have provided an in-depth characterization of the binding affinity of the NTHi adhesin Hia, by determining the major human cellular receptors it has evolved to bind and by demonstrating the molecular basis of these interactions. We also demonstrate that Hia has a role in biofilm formation by NTHi, and therefore likely contributes to antibiotic resistance and chronicity by this mechanism. Knowledge of the factors required by NTHi to colonise and cause disease will be key to developing both vaccines and treatments against this organism. Our demonstration that the major NTHi adhesins HMW1 and HMW2 bind the same host glycans as Hia (20), and that these adhesins are expressed by nearly 100% of NTHi strains, is a key step towards the development of a rationally designed vaccine against NTHi, and to the production of novel treatments against this pathogen.

Generation of a hia knockout mutant in NTHi strains R2866 and 11
A region of the NTHi R2866 chromosome containing the hia promoter, the ATG start codon and the 5′ region of the gene was generated by PCR using primer pair hia-UP-F / hia-UP-R, and cloned into pGEM-T Easy according to manufacturer's instructions (Promega) to generate plasmid vector Teasy::hiaUP. Inverse PCR was used to linearise this vector at the hia start codon using primers hia-INV-F / hia-INV-R. A tetracycline resistance cassette, encoding tetM, was generated from plasmid vector pGEM-TetM(B) using M13F and M13R primers. This was cloned into the linearised Teasy::hiaUP vector so that the cassette was in the same orientation as the hia gene, and the orientation was confirmed using PCR and sequencing. This vector was designated Teasy::hiaUP::TetM. Following linearization with NgoMIV (New England Biolabs), DNA was transformed into NTHi strains R2866 and strain 11 using the MIV method (46). Transformants were selected on BHI media containing 5 µg/mL tetracycline, and positive colonies were confirmed by sequencing and by Western blotting using the anti-Hia monoclonal antibody 1F4 (47). Strains were designated as R2866 or strain 11 hia::tet.

Cloning and over-expression of full length Hia in E. coli
Primers HiaFULL-F and HiaFULL-R (Supplementary Table 1) were used to amplify full-length wild-type hia (R2866_0725), including the signal sequence (residues 1-49), from genomic DNA prepared from NTHi strain R2866. PCR was carried out using KOD hot-start polymerase (EMD Millipore) according to manufacturer's instructions. Following digestion with BspHI and XhoI (NEB) and clean up, DNA was cloned into pET15b digested with NcoI and XhoI. The resulting plasmid was designated pET15b::Hia. Following confirmation of correct clones by sequencing, over-expression was carried out in E. coli BL21 by inducing cells with 0.5 mM IPTG overnight at 37°C with 200 rpm shaking. Over-expression was confirmed by Western blot as previously described (23)

Generation of Hia point mutants for over-expression
Inverse PCR was carried out using primer pairs designed to introduce point mutations previously described to abrogate binding of E. coli expressing Hia to Chang cells (22). D618K, A620R and a 618/620 double mutant were generated using specific forward primers Hia-D618K-F, Hia-A620R-F, or Hia-618/620-double-F and the common reverse primer Hia-618/620-R. An R674A mutant was generated using primer pair Hia-R674A-F and Hia-R674A-R. All inverse PCR reactions were carried out using KOD hot-start polymerase (EMD Millipore) according to manufacturer's instructions, with a plasmid mini-prep (Qiagen) of pET15b::Hia as template. All primer sequences are listed in Supplementary Table 1. Clones were sequenced using primers on either side of the point mutations (Hia-screen-F and Hia-screen-R) with BigDye 3.1 according to manufacturer's instructions (Thermo Fisher), at the Australian Genome Research Facility (AGRF, Brisbane, Australia). Over-expression was carried out as described above for wild-type Hia, and cell surface localization was confirmed using whole cell ELISA as above.

Over-expression and purification of Hia BD1
Primers to clone Hia binding domain 1 (BD1; amino acid residues 540-714) were designed based on those from (22). HiaBD1-F and HiaBD1-R were used to amplify BD1 from NTHi strain R2866 genomic DNA using KOD hot-start polymerase (EMD Millipore) according to manufacturer's instructions. Following digestion with NdeI and BamHI (NEB) and clean up, DNA was cloned into pET15b digested with the same enzymes. This strategy clones the gene in frame with an N-terminal 6xHis tag for purification. The resulting plasmid was designated pET15b::HiaBD1. Over-expression was carried out in E. coli BL21 by inducing cells with 0.5 mM IPTG overnight at 37°C with 200 rpm shaking. Cells were pelleted, resuspended in 1x binding buffer (50 mM NaPO4, 300 mM NaCl, pH 7.4), and lysed using 0.1 mm glass beads and a TissueLyser (Qiagen) for 30 mins at 50 osc -1 min -1. Purification was carried out using TALON gravity flow resin in 1x binding buffer. Protein was eluted from the resin using stepwise concentrations of imidazole in 1x binding buffer (10-500 mM imidazole), fractions were analysed by SDS-PAGE, and fractions containing pure BD1 were pooled and concentrated using centrifugal concentrators (Millipore, 10 kDa cut-off). Pure concentrated BD1 was buffer exchanged into 1x phosphate buffered saline (1x PBS) using the same centrifugal concentrators. Protein was analysed by SDS-PAGE, and quantified using an extinction coefficient of 8480 M⁻¹ cm⁻¹ and MW of 20471.43 Da (based on the sequence of Hia BD1 + 6xHis tag).
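
Quantification from an extinction coefficient and molecular weight is an application of the Beer-Lambert law, A = ε·c·l. The Python sketch below shows the conversion using the ε and MW values given above. It assumes an absorbance reading at 280 nm and a 1 cm path length, neither of which is stated in the text, and the example absorbance of 0.424 is hypothetical rather than a measured value.

```python
# Illustrative Beer-Lambert calculation for His-tagged Hia BD1.
# epsilon and MW are taken from the text; wavelength (280 nm), path length
# (1 cm) and the example absorbance are assumptions made for illustration.

EPSILON_M_CM = 8480.0     # M^-1 cm^-1
MW_DA = 20471.43          # g/mol, Hia BD1 + 6xHis tag

def protein_concentration(absorbance: float,
                          epsilon: float = EPSILON_M_CM,
                          path_cm: float = 1.0,
                          mw: float = MW_DA):
    """Return (molar concentration in M, mass concentration in mg/mL)."""
    molar = absorbance / (epsilon * path_cm)   # c = A / (epsilon * l)
    mg_per_ml = molar * mw                     # g/L is numerically equal to mg/mL
    return molar, mg_per_ml

molar, mg_per_ml = protein_concentration(0.424)  # hypothetical reading
print(f"{molar * 1e6:.1f} uM, {mg_per_ml:.2f} mg/mL")
```

With these assumptions, an absorbance of 0.424 corresponds to about 50 µM, or roughly 1.02 mg/mL, of BD1.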

Glycan array
Glycan array slides were printed using OPEpoxy (CapitalBio) activated substrates with the glycan library as previously described (49) [...]

Purified Hia BD1 protein was immobilised onto a CM5 sensor chip by amine capture on a Biacore T100 with a contact time of 600 seconds at a flow rate of 5 µL/minute in 10 mM sodium acetate pH 4.5. Glycans were run at the optimised concentrations outlined above, with the analysis performed as outlined above.

Peptide binding region identification was performed using a modified version of a previously described method (53), namely competition assays using immobilised Hia-expressing cells and flowed peptides and glycan. E. coli BL21 expressing full-length wild-type Hia were immobilised onto a H1 sensor chip using a ForteBio Pioneer with a contact time of 720 seconds at a flow rate of 10 µL/minute in 1x PBS at 1 × 10^8 bacteria/mL. Assays were set up using the NextStep injection feature as previously described (54, 55)
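
For orientation, the contact times and flow rates given for the two immobilisation steps translate into injected volumes as follows. This is a simple arithmetic sketch in Python with illustrative function names; it says nothing about how much material actually couples to the chip.

```python
# Illustrative arithmetic: injected volume = contact time x flow rate.
# Values are those quoted in the text for the two immobilisation steps.

def injected_volume_uL(contact_time_s: float, flow_uL_per_min: float) -> float:
    """Volume passed over the sensor surface during the contact time."""
    return contact_time_s * flow_uL_per_min / 60.0

def cells_injected(volume_uL: float, cells_per_mL: float) -> float:
    """Number of bacteria contained in the injected volume."""
    return volume_uL * cells_per_mL / 1000.0

bd1_volume = injected_volume_uL(600, 5)      # Biacore T100, purified BD1
cell_volume = injected_volume_uL(720, 10)    # ForteBio Pioneer, BL21-Hia cells
cells = cells_injected(cell_volume, 1e8)     # suspension at 1 x 10^8 bacteria/mL

print(f"BD1 immobilisation: {bd1_volume:.0f} uL injected")
print(f"Cell immobilisation: {cell_volume:.0f} uL injected, ~{cells:.1e} bacteria")
```

The BD1 step corresponds to 50 µL of protein solution, and the whole-cell step to 120 µL of suspension, or about 1.2 × 10^7 bacteria passed over the chip.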