The Vibrio cholerae Colonization Factor GbpA Possesses a Modular Structure that Governs Binding to Different Host Surfaces

Vibrio cholerae is a bacterial pathogen that colonizes the chitinous exoskeleton of zooplankton as well as the human gastrointestinal tract. Colonization of these different niches involves an N-acetylglucosamine binding protein (GbpA) that has been reported to mediate bacterial attachment to both marine chitin and mammalian intestinal mucin through an unknown molecular mechanism. We report structural studies that reveal that GbpA possesses an unusual, elongated, four-domain structure, with domains 1 and 4 showing structural homology to chitin binding domains. A glycan screen revealed that GbpA binds to GlcNAc oligosaccharides. Structure-guided GbpA truncation mutants show that domains 1 and 4 of GbpA interact with chitin in vitro, whereas in vivo complementation studies reveal that domain 1 is also crucial for mucin binding and intestinal colonization. Bacterial binding studies show that domains 2 and 3 bind to the V. cholerae surface. Finally, mouse virulence assays show that only the first three domains of GbpA are required for colonization. These results explain how GbpA provides structural/functional modular interactions between V. cholerae, intestinal epithelium and chitinous exoskeletons.


Introduction
Vibrio cholerae is a Gram-negative bacterial pathogen that causes excessive watery diarrhea in humans [1,2]. The number of reported cases of cholera worldwide has averaged over 100,000 per annum for the last 10 years [1,3], and is presumed to be exceeded by the number of unreported cases [3,4]. Most of these cases occur in countries with poor sanitation [5,6]. V. cholerae strains are classified into more than 200 serogroups, with only the serogroups O1 and O139 possessing epidemic potential. It has been shown that the survival of V. cholerae in the intestine is dependent on its ability to adhere to and colonize cell surfaces [7,8]. In aquatic environments, attachment to fish, crustacea and algae enables the bacteria to obtain nutrients, thereby providing a competitive advantage over other free-swimming bacteria [8,9,10]. Adherence to aquatic organisms is believed to involve a different set of genes and recognition molecules from those used for intestinal colonization. For example, plankton surface colonization by the bacteria is more dependent on the mannose-sensitive hemagglutinin (MSHA) [11,12].
Recent studies have suggested that Vibrio cholerae secretes a protein that mediates adhesion in aquatic environments, e.g. to plankton, as well as adhesion to human intestinal cells [7,10]. This protein, GlcNAc binding protein A (GbpA) binds to N-acetylglucosamine (GlcNAc)-containing carbohydrates, such as chitin, and is secreted by the type 2 secretion system [7]. In addition to chitin, GbpA has been shown to bind to mucins [13] that also contain GlcNAc as part of their densely packed network of O-linked glycans (reviewed in [14,15]). The importance of GbpA for bacterial colonization has been demonstrated for the O395 [7] and N16961 Vibrio cholerae strains [16] (representing classical [17] and El Tor [18] biotypes, respectively).
Here, we have studied the molecular basis of the function of GbpA as a colonization factor, using a combination of approaches. Firstly, we determined the three-dimensional structure of GbpA using a combination of X-ray crystallography and small angle X-ray scattering, revealing an unusual, elongated four-domain fold. Two domains appear to be structurally similar to known chitin binding modules, and we show these domains to be responsible for the ability of GbpA to bind chitin. The other two domains possess distant structural homology to bacterial pili binding proteins and serve to bind the V. cholerae surface. Finally, complementation studies with truncated forms of GbpA identify the domains responsible for mucin binding and virulence in a V. cholerae mouse infection model.

GbpA is a multi-domain protein
Full-length GbpA and a range of truncated forms of GbpA (see Supplemental Table S1 in Text S1 for a summary of all constructs used in this study) were cloned and expressed in E. coli. The mature form of the protein (residues 24 to 485, GbpA fl ) was extracted from the periplasm and purified by chromatography. Two C-terminally truncated forms of mature GbpA, comprising amino acids 24-203 (GbpA D1 ), and 24-414 (GbpA D1-3 ), were also produced. Two N-terminally truncated forms of GbpA comprising amino acids 210-414 (GbpA D2-3 ), and 423-485 (GbpA D4 ) were expressed as GST fusion proteins and purified to homogeneity.
Although extensive efforts were made to crystallise full-length GbpA, GbpA D1-3 was the longest construct that could be crystallised. The GbpA D1-3 structure was solved by SAD phasing, and refined against 1.8 Å synchrotron diffraction data (Supplemental Table S2 in Text S1). The structure of GbpA D1-3 reveals three distinct domains rich in β-structure, linked by disordered loops (Figure 1A,B). These three domains, and their similarities to previously reported protein structures, are discussed separately.

Domain 1 is structurally similar to a CBM33 chitin binding protein
Domain 1 (GbpA D1) comprises a four-stranded and a three-stranded β-sheet forming a β-sandwich (Figure 1B,C). Between β-strands 1 and 3, a 65-residue loop forms a pseudo-domain, which consists of short α-helices, a β-strand, and loops. There are also two disulphide bonds: one in the loop/helical region (Cys-42 and Cys-56) and one linking β-strands 4 and 5 (Cys-152 and Cys-169) (Figure 1B). A structural similarity search with the Dali server [19] identified chitin binding protein 21 (CBP21) from S. marcescens, which is part of the CAZy Carbohydrate Binding Module family 33 (CBM33 [20]), as a structural homologue (Figure 2, RMSD = 0.9 Å for 168 Cα atoms, 47% sequence identity). Previous work has shown that CBP21 facilitates chitin degradation in the presence of chitinases [21], and that residues Y54, E55, E60, H114, D182 and N185 are important for chitin binding by CBP21 [21,22]. Interestingly, most of these are conserved in GbpA domain 1 (Figure 2), except N185, which is equivalent to GbpA A191. Y54 in CBP21 has been shown to play a role in determining binding specificity as well as affinity for different types of chitin [22]. Residue 191 of GbpA is an alanine, and mutation of the equivalent N185 in CBP21 to alanine resulted in a three-fold reduction in chitin affinity [22].
A noticeable difference between CBP21 and GbpA D1 is the length of two loops: one located between β-strand 1 and α-helix 1 (amino acids 29 to 39), and another between α-helix 1 and α-helix 2 (amino acids 44 to 56) of GbpA (Figure 1B, Figure 1C). These loops form a continuous patch on the opposite side from the proposed chitin binding face of domain 1 (Figure 2).

GbpA domains 2 and 3 possess distant structural similarity to bacterial surface proteins
Structural similarity searches with DALI revealed only distant structural similarities between domain 2 of GbpA and the β-domain of the flagellin protein p5 (Z = 4.0, RMSD = 2.9 Å for 68 Cα atoms, Figure 2). According to Maruyama et al., this domain interacts with the bacterial surface, and functions to project an alginate binding domain of the protein from the cell surface [23]. A more significant structural match was observed for domain 3 of GbpA. This domain has an immunoglobulin fold (Figure 2) with significant structural similarity to over 200 structures in the PDB. The best hit was SFAE (Z = 9.8, RMSD = 2.4 Å for 91 Cα atoms), a chaperone that functions to fold and transport components of Escherichia coli surface pili to the cell surface [24]. Structures similar to SFAE have been solved in complex with pili subunits; for instance, the protein FimC (the 4th best DALI hit, Z = 9.4, RMSD = 2.2 Å for 91 Cα atoms, Figure 2) has been solved in complex with the pili subunit FimH [25,26].

Domain 4 is similar to a chitinase chitin binding domain
The crystal structure of GbpA D1-3 lacks the C-terminal domain 4, comprising residues 415-485 (Figure 1). Sequence alignments revealed that this domain shows 26% sequence identity with the C-terminal chitin-binding domain of S. marcescens chitinase B (SmChiB), whose structure is almost devoid of helices and contains 32% β-strands (Supplemental Figure S1 in Text S1, [27]). Indeed, when this GbpA domain (GbpA D4) was expressed and purified, circular dichroism experiments showed that it also possesses an essentially all-β secondary structure (Supplemental Figure S1 in Text S1). Interestingly, the sequence alignment suggests that only one (W479) of two aromatic residues (W479 and Y481 [27]) that are thought to be important for SmChiB-chitin interactions is conserved in GbpA (W463). However, the aromatic residues that form the hydrophobic core of the SmChiB chitin binding domain (Y470, Y473, W492) are conserved in GbpA. Taken together, it is likely that GbpA possesses an additional C-terminal domain 4, with structural similarity to the SmChiB C-terminal chitin binding domain.

Author Summary
Vibrio cholerae is the bacterium that causes cholera, a disease endemic in developing countries with poor sanitation. The bacterium colonizes aquatic organisms that serve as a reservoir of transmission to humans. Our work has focused on GbpA, a protein that is secreted by V. cholerae and appears to facilitate growth of the bacteria both in the human intestine and on the exoskeletons of marine organisms. We show that the protein possesses an unusual three-dimensional structure consisting of four separate domains. Two of the domains are similar to proteins that are known to bind chitin, an exoskeleton biopolymer, and our data show that these domains indeed harbour the chitin binding properties of GbpA. One of these domains is also capable of binding intestinal mucus. The two remaining domains are required for interacting with the bacterium itself, creating a stable interface between the bacterium and the human/marine host, facilitating colonization. Finally, work with a cholera mouse model shows that only the first three domains of GbpA are required for colonization. These results show how GbpA provides structural/functional modular interactions between V. cholerae, the intestinal epithelium and chitinous exoskeletons.

Figure 1. Structure of GbpA. A. Schematic representation of the functional domains of GbpA. The residue boundaries for each domain of the determined structure of GbpA are numerically labelled above the diagram. The N-terminal SP box refers to the signal peptide that signals the protein for secretion through the type-2 secretory pathway. B. Stereo images of the crystallised structure of GbpA D1-3. The structure is coloured according to the secondary structure for each domain. For domain 1, the α-helices are coloured blue and the β-sheets are coloured in red. For domain 2, the α-helices are coloured in brown, and the β-sheets are coloured in cyan. For domain 3, the β-sheets are coloured in yellow. All strands are coloured in grey. The residues defining domain boundaries are labelled accordingly. The disulfide bonds are coloured as two yellow coloured sticks. The first and last β-strands for each domain, and all α-helices, are labelled. C. Topology diagram of GbpA D1-3 drawn using TOPDRAW [52]. The β-strands and the α-helices are numbered sequentially. D. The ab initio SAXS model of GbpA fl (gray spheres) is superimposed onto the structure of GbpA D1-3 and the modelled structure of GbpA D4 (ribbon models) as determined by rigid body refinement using SASREF. doi:10.1371/journal.ppat.1002373.g001

Figure 2. Comparisons of domains 1-3 of GbpA with structural homologues. The space filled models for each domain of the crystal structure of GbpA D1-3 are shown superimposed onto the corresponding ribbon models. For each ribbon model, the secondary structure was coloured in the same scheme as Figure 1B. Each domain of GbpA D1-3 was parsed through a structural alignment server (DALI, [19]). The protein domains giving the best alignment to each domain of GbpA are presented on the right as space filled models superimposed onto their corresponding ribbon models. The individual domains of GbpA are also presented in the same orientation as the structural homologues. For the first row, CBP21 (2BEM.pdb) aligned the best to domain 1 of GbpA. The magenta-coloured surfaces of domain 1 and CBP21 indicate areas of high sequence conservation, with side chains specifically discussed in the text shown as sticks and labelled. The position of CBP21 Y54 (and the GbpA equivalent Y61) that is critical for chitin binding is coloured in pink. Arrows indicate the position of the extended loops in GbpA that are absent in CBP21. For the second row, the flagellin protein p5 (2ZBI.pdb) showed the closest structural alignment to domain 2 of GbpA. For the third row, the pili-binding chaperone FimC (1QUN.pdb) was shown to be structurally similar to domain 3 of GbpA. The structure of FimC was solved in complex with part of the Type 1 pili FimH. In the figure, FimH is shown in magenta. doi:10.1371/journal.ppat.1002373.g002

GbpA exists as an elongated monomer in solution
To determine the position of domain 4 in the context of the complete GbpA structure, we studied the solution shapes of full length GbpA and GbpA D1-3 with Small Angle X-ray Scattering (SAXS). The scattering curves yielded estimated molecular masses for GbpA fl (60±5 kDa) and GbpA D1-3 (46±5 kDa) samples that are compatible with their monomeric structures (theoretical masses of 54 kDa and 44 kDa, respectively). This finding is further corroborated by the excluded volumes of GbpA fl (100±10 nm³) and GbpA D1-3 (80±10 nm³), since for sufficiently large globular proteins the hydrated volume in nm³ should numerically be about twice the molecular mass in kDa [28]. The experimental radius of gyration Rg (3.90±0.05 nm for GbpA fl and 3.55±0.05 nm for GbpA D1-3) and maximum diameter Dmax (13.5±1.0 nm for GbpA fl and 11.5±1.0 nm for GbpA D1-3) indicate that the proteins behave as extended particles. The solution shapes of GbpA fl and GbpA D1-3 were reconstructed ab initio using the program DAMMIF [29], with good discrepancy factors (χ = 1.16 for GbpA fl and χ = 1.22 for GbpA D1-3). For each protein, ten independent reconstructions produced similar shapes and these were averaged using DAMAVER [30]; the ab initio structure is represented in Figure 1D as a wired mesh.
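The two consistency arguments in the SAXS analysis (volume roughly twice the mass, and a radius of gyration larger than a compact particle would give) can be checked with a few lines of arithmetic. The sketch below uses only the values quoted in the text; the comparison against a uniform solid sphere of the same volume is an illustrative assumption added here, not an analysis performed by the authors.

```python
import math

# Values quoted in the text (SAXS estimates and theoretical masses).
samples = {
    "GbpA fl":   {"volume_nm3": 100.0, "theoretical_kda": 54.0, "rg_nm": 3.90},
    "GbpA D1-3": {"volume_nm3": 80.0,  "theoretical_kda": 44.0, "rg_nm": 3.55},
}

def mass_from_volume(volume_nm3):
    """Rule of thumb cited in the text: volume (nm^3) is about 2 x mass (kDa)."""
    return volume_nm3 / 2.0

def sphere_rg(volume_nm3):
    """Radius of gyration (nm) of a uniform solid sphere with this volume.

    Illustrative assumption: a sphere stands in for a compact globular particle.
    """
    radius = (3.0 * volume_nm3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return math.sqrt(3.0 / 5.0) * radius

for name, s in samples.items():
    est = mass_from_volume(s["volume_nm3"])
    ratio = s["rg_nm"] / sphere_rg(s["volume_nm3"])
    print(f"{name}: ~{est:.0f} kDa from volume "
          f"(theoretical {s['theoretical_kda']:.0f} kDa); "
          f"observed Rg / compact-sphere Rg = {ratio:.2f}")
```

For both samples the volume-derived mass lands close to the theoretical monomer mass, and the observed Rg is well above the compact-sphere value, which is the pattern expected for an elongated monomer.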
To determine the relative orientations of the domains, a model of GbpA fl was generated by rigid-body refinement with the separate GbpA domain structures taken from the domain 1-3 crystal structure (Figure 1B), and the model of domain 4 (Supplemental Figure S1 in Text S1), fitting simultaneously to the experimental SAXS data of GbpA D1-3 with χ = 1.41 and of GbpA fl with χ = 1.53. GbpA fl and GbpA D1-3 adopt a rod shape with the domains twisted along the long axis (Figure 1D). There are no direct inter-domain interactions, and the surface of each domain appears to be completely exposed to the external environment, suggesting a certain degree of flexibility in the orientation of each domain of GbpA.

GbpA selectively binds chito-oligosaccharides through domains 1 and 4
Previous work has suggested that GbpA binds N-acetylglucosamine (GlcNAc) sugars [7]. To investigate whether this extends to GlcNAc-containing glycans, we studied the binding specificity of GbpA by screening a well-established library of mammalian N/O-linked glycans that also includes a range of linear oligosaccharides [31,32]. Interestingly, GbpA fl selectively binds chito-oligosaccharides of varying length (Figs. 3A, S5). To quantify chitin binding and identify the domains responsible, direct binding assays were carried out with a range of polysaccharides (Figure 3B). GbpA D4 binds to all chitin forms tested, with the highest affinity for amorphous forms of chitin (colloidal chitin and chitin beads). GbpA D1 shows significant binding only to α- and β-chitin (Figure 3B). GbpA D2-3 does not bind chitin. For GbpA fl, strong binding to all forms of chitin was observed. None of the GbpA proteins showed binding to GlcNAc-beads or cellulose. Although this is in apparent contrast with a previously published report showing that GbpA-expressing V. cholerae bind to GlcNAc beads [7], it could also suggest that additional bacterial factors may be required for the reported GlcNAc binding. Together, these data show that the two terminal chitin-binding domains in GbpA endow the protein with the ability to bind different types of GlcNAc oligomers and polymers.

Domain 1 of GbpA is required for mucin binding
V. cholerae GbpA has been shown to interact with intestinal mucin [13]. To identify the domain(s) involved, GbpA fl and its truncated variants were tested for their ability to bind to mucin, mucin coupled to sepharose beads, intestinal epithelial cells and brush border membranes of the intestine (Table 1). GbpA with a Tyr61Ala mutation (GbpA (Y61A)) was also tested to assess whether Tyr61 is important for mucin binding, as has been shown for chitin binding in the case of the equivalent Tyr54 in CBP21. Interestingly, only GbpA fl, GbpA (Y61A), and GbpA D1-3 bind to mucin with µM affinity, as further confirmed by ELISA for those constructs that tested positive in the initial mucin-binding test. Thus, domain 1 is essential for mucin binding, whereas domain 4 is dispensable. Tyr61 on domain 1 of GbpA is not essential for mucin binding.
To further investigate mucin binding to domain 1, competitive mucin binding assays were carried out with recombinant truncated versions of GbpA and wild type V. cholerae (N16961) ( Figure 4A). Increasing concentrations of purified GbpA variants added to mucin-coated wells showed varying abilities of the proteins to prevent bacteria from binding the immobilized mucin. GbpA fl , GbpA (Y61A), GbpA D1 and GbpA D1-3 could inhibit binding of N16961 to mucin in a concentration-dependent, saturating manner ( Figure 4A). At saturation, GbpA fl and the GbpA (Y61A) mutant inhibited binding of V. cholerae to mucin by 83%. Truncated GbpA variants comprising only domains 2, 3, and/or 4 could not inhibit the binding of V. cholerae to mucin, in agreement with the direct binding assays.

GbpA domains 1-3 are essential for intestinal colonization and pathogenesis
In order to understand which domains of GbpA are essential for colonization, we first studied the surface expression of these domains in complemented GbpA knockout V. cholerae strain N16961 (N1RB3 or ΔgbpA). Immunoblot analysis revealed that the ΔgbpA strain complemented with GbpA fl, GbpA (Y61A), or GbpA D1-3 displays these proteins on the bacterial surface (Supplemental Figure S2 in Text S1). For the strains complemented with GbpA D1, GbpA D2, GbpA D2-3, or GbpA D4, no protein could be detected on the bacterial cell surface (Supplemental Figure S2 in Text S1). These results suggest that the presentation of domain 1 on the cell surface is dependent on the presence of domains 2 and 3.
To probe the role of domains 1-3 in intestinal colonization, the pathogenicity of the complemented strains was analyzed by studying intestinal fluid accumulation and colonization in a mouse model (Figure 4B and Figure 5A). All complemented strains were similar to the wild type strain, with the exception of the GbpA D2-3 complemented strain (N1RB3-GbpA D2-3), which was affected in its ability to colonize the mice, similar to the ΔgbpA N16961 control (N1RB3).

Domains 2 and 3 of GbpA interact with the V. cholerae surface
As shown earlier, GbpA domains 1 and 4 are mainly responsible for chitin, mucin and intestinal epithelium binding. However, GbpA is a secreted protein, and some of the domains must therefore be capable of interacting with the V. cholerae surface to form a stable host-pathogen interface. To test this, we investigated binding of recombinant full length, mutant and truncated versions of GbpA to the ΔgbpA strain (Figure 5B). The results suggest that domains 2 and 3, but not domains 1 or 4, are required for binding to the bacterial surface.

Discussion
Since the discovery of GbpA, little progress has been made towards an understanding of the molecular properties governing the protein's ability to mediate interactions between the bacteria and the host surfaces, both in the marine environment and the mammalian host. While it has been reported that the protein binds to chitin and intestinal epithelia, the chitin specificity was not probed, nor was it understood how the protein interacts with intestinal epithelial cells. Here we have reported the crystal and solution structures of GbpA, showing an unusual 4-domain elongated structure. Remarkably, the modular domain structure translates to modular interaction properties with different substrates/surfaces. For instance, the first domain has a pseudo-fibronectin type 3 fold that shares considerable similarity with the chitin-binding protein CBP21. Our chitin binding studies showed that this domain, like CBP21, binds to α/β chitin. Furthermore, we were able to demonstrate the interaction of domain 1 with mucin, a key surface component of intestinal epithelial cells. Thus, the first domain of GbpA harbours the ability to bind to both mucin and certain types of chitin. It is not uncommon that carbohydrate binding modules (CBMs) bind to several substrates or that the same CBM scaffold is used for various binding specificities [20].
For GbpA domains 2 and 3, virtually no sequence similarity to any proteins of experimentally defined function exists. The crystal structure of these two domains suggests their possible function. Domains 2 and 3 bear some resemblance to proteins such as SFAE and FimC, chaperones that interact with pili and form part of the chaperone-usher pathway of pili biosynthesis. Both these chaperones reside in the periplasm, but in the case of FimC, reports have shown that the protein can interact with pili subunits, such as FimH, in vitro [33,34]. Interestingly, our in vivo studies show that domains 2 and 3 of GbpA are required for interaction with the bacterial surface. It is also possible that the two domains may facilitate transportation of GbpA to the cell surface via the type II secretion system.
The structure of domain 4 has so far not been obtained, despite numerous attempts to crystallize either the full-length GbpA, this domain alone or as a complex with short chain chitin. The sequence of the domain is distantly related to the chitin binding domain of SmChiB (Supplemental Figure S1 in Text S1) suggesting a chitin binding function [7]. In our chitin binding assay, we show that domain 4 has greater binding affinity to colloidal chitin and to chitin beads than domain 1. However, domains 1 and 4 together conferred greater binding to chitin of all forms. In contrast to a previously published report [7], no binding of domains 1 and 4 was observed to single GlcNAc residues. Also, domain 4 cannot bind to mucin and is dispensable for V. cholerae colonization of intestinal epithelial cells and pathogenesis.
We also attempted to identify the core glycan unit of mucin that GbpA interacts with; however, none was detected in the glycomics screen, which includes simple single O-linked core structures that are commonly found on mucins. It is possible that GbpA binding to mucin depends on the glycan-protein linkage, as is not uncommon for other glycan binding proteins [35], or on a binding avidity of GbpA for mucin that cannot be simulated on the glycan binding array [36].
In light of our results, we propose the following mechanism for GbpA-mediated bacterial colonization of intestinal epithelial cells. V. cholerae constitutively produces GbpA prior to colonization, mainly as a secreted protein. Once in the host intestine, GbpA attaches itself via domain 1 to mucin, thereby marking the surface for V. cholerae colonization. Domains 2-3 of the protein then bind to the V. cholerae surface, enhancing microcolony formation. In a marine environment, domains 1 and 4 provide GbpA with versatile binding properties to different types of crystalline chitin, allowing the bacterium to attach to, and colonize, a range of crustacea.

Ethics statement
All animal experiments were conducted following the standard operating procedure as outlined by the Committee for the Purpose of Supervision and Control of Experiments on Animals (CPCSEA), Govt. of India. The animal experimental protocol was approved by the Institutional Animal Ethics Committee of the National Institute of Cholera and Enteric Diseases (NICED) (Registration No. 68/1999/CPCSEA dated 11-03-1999). New Zealand white rabbits weighing about 2 kg were used for intestinal mucin preparation. Four- to five-day-old BALB/c mice were used for fluid accumulation studies. For intestine harvesting, the animals were euthanized in a CO2 chamber. All efforts were made to minimize suffering during euthanasia.

Cloning and sequencing
The gene coding for full-length gbpA (1-485 amino acids, accession number VCA0811) was cloned from genomic DNA of V. cholerae into the plasmid pET-22b (Novagen) between the NdeI and XhoI sites. The primers used to amplify the gene were 5′-GCGGAATTCCATATGAAAAAACAACCTAAAATGACCGC-3′ for the forward primer carrying the NdeI restriction site, and 5′-CCTCGAGTCATTAACGTTTATCCCACGCCATTTCCC-3′ for the reverse primer carrying the XhoI restriction site. From this clone, several truncated versions of GbpA were made either by introducing a stop codon in the original expression vector (pET22b) through the use of the QuickChange mutagenesis kit (Stratagene, La Jolla, CA), or by subcloning into the pGEX6P vector (GE Healthcare) through the BamHI and XhoI restriction sites. All constructs used are summarised in Supplemental Table S1 in Text S1. Constructs in the pET22b vector (containing the natural signal peptide for secretion) include GbpA comprising the following amino acids: 1-485, 1-414, and 1-203. The mature forms of these proteins produced by the bacteria would comprise the following amino acids

Protein production
The constructs of GbpA were transformed into E. coli C43. To induce protein expression, the bacteria were grown in LB medium to an OD600 of 0.6 before induction with isopropyl-β-D-thiogalactoside (IPTG) at a final concentration of 0.2 mM, and incubation for 16 h at 20 °C. Depending on the vector used (pET22b or pGEX6P), two methods were employed to extract the recombinant protein from the bacterial lysate. GbpA fl, GbpA D1-3, and GbpA D1 were prepared from a periplasmic fraction obtained by osmotic shock. The method involved centrifuging (4550 × g) the cells and then resuspending them in periplasmic buffer (200 mM Tris/HCl pH 7.5, 20% sucrose, 1 mM EDTA and 0.5 mg/ml lysozyme) for 30 minutes at room temperature. The suspension was then cold shocked on ice for 10 minutes, and the periplasmic content isolated by centrifugation at 4000 × g (4 °C). The proteins [...] involves harvesting the induced cells by centrifugation (4550 × g), and subsequently resuspending the bacteria in lysis buffer (25 mM Tris/HCl pH 7.5, 150 mM NaCl). Lysozyme (0.1 mg/ml) and DNAse (0.1 mg/l) were added, and the mixture was incubated on ice for 20 minutes. The lysate was then sonicated before centrifugation at 50,000 × g for 30 minutes. The soluble fraction was collected, passed through a 0.2 µm filter, and incubated with glutathione beads (pre-equilibrated with lysis buffer) for 2 h. Beads containing GbpA were washed with lysis buffer followed by digestion with PreScission protease. Proteins were further purified by gel filtration (26/60 Superdex 200) in lysis buffer.

Crystallization, structure solution and refinement
Purified GbpA D1-3 was concentrated to 25 mg/ml, and used for sitting drop vapor diffusion crystallization experiments using a mother liquor containing 0.2 M Mg(HCO3)2, 50% (w/v) PEG 3350, 3.33% (w/v) D-sorbitol. For phasing, crystals were soaked with mother liquor containing 20 mM zinc chloride for 12 h. Crystals were cryoprotected with mother liquor containing 5% glycerol (v/v) and then frozen in a nitrogen cryostream. Data were collected on beamline BM14 at the European Synchrotron Radiation Facility and processed with the HKL suite [37] (Supplemental Table S2 in Text S1). Six zinc atom sites were located by SOLVE, yielding phases to 2.25 Å with an overall figure of merit of 0.30 (Supplemental Table S2 in Text S1) [38]. Combination of the SAD phases with the native amplitudes, solvent flattening and two-fold averaging using DM [39] resulted in an interpretable map. WarpNtrace [40] was used to build a total of 624 (out of 782) residues. The model, excluding residues 203-208 and 313-318, was completed using Coot [41] interspersed with refinement with REFMAC [42] (Supplemental Table S2 in Text S1). The asymmetric unit contained two monomers, with an RMSD of 0.6 Å. In the interest of simplicity, monomer A was used throughout the data analysis and discussion in this manuscript.
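For readers unfamiliar with the RMSD values quoted for the two monomers and for the DALI comparisons, the quantity is the root of the mean squared distance between paired atoms. The following is a minimal sketch, assuming the two coordinate sets are already superposed and paired atom by atom; the coordinates are hypothetical and chosen only so that the result matches the 0.6 Å scale mentioned above. It does not reproduce the superposition step performed by the refinement or alignment software.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD (in the units of the input, e.g. Å) between paired (x, y, z) points.

    Assumes both lists are already superposed and listed in matching atom order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical Cα positions: monomer B is monomer A shifted by 0.6 Å along y.
monomer_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
monomer_b = [(0.0, 0.6, 0.0), (3.8, 0.6, 0.0), (7.6, 0.6, 0.0)]

print(f"RMSD = {rmsd(monomer_a, monomer_b):.2f} Å")
```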

Small angle X-ray scattering (SAXS)
Synchrotron SAXS data of GbpA were collected according to the standard protocols established at the X33 beam line at the Deutsches Elektronen-Synchrotron DESY (Hamburg, Germany), as detailed in the supplementary material. Briefly, twenty low-resolution models of GbpA fl were built by the program DAMMIF [29] and averaged. A model of GbpA fl was also constructed by rigid body modelling using the program SASREF [43], employing the individual GbpA domain structures.

Glycan and chitin binding experiments
Recombinant GbpA fl and truncated forms of GbpA were screened for glycan binding against a library of 264 natural and synthetic glycans (100 µM) with amino linkers, printed onto chemically-modified glass microscope slides. The facility to carry out the experiment (Core H, Printed Array Version 2) was kindly provided by the Consortium of Functional Glycomics (http://www.functionalglycomics.org). The method used to prepare the samples for glycan binding analysis has been reported previously [44]. Six measurements were made and the values averaged (reported here as average Relative Fluorescence Units). Additional results from the screen are available from the supplementary section (Supplemental Figure S2 in Text S1).

Chitin binding experiments
GbpA fl, GbpA D1-3, GbpA D1, GbpA D2-3, and GbpA D4 (all 5 µM) in PBS buffer (pH 7.5) were incubated for 24 hours at 21 °C with various chitin substrates (5 mg/ml) in a total volume of 300 µl per sample. All samples were run in triplicate, including controls that contained only buffer and substrate. The chitin substrates used were α-chitin (from shrimp shells, Hov-Bio, Tromsø, Norway), β-chitin (from squid pen, France Chitin, Marseille, France), colloidal chitin (made from crab α-chitin from Sigma), chitin beads (New England BioLabs), GlcNAc-coated agarose beads (Fluka), and Avicel (microcrystalline cellulose, Sigma). The binding experiment was carried out in Eppendorf tubes where protein and substrate were mixed by axial rotation. After incubation, samples were centrifuged for 5 minutes at 25,500 × g, followed by separation of the pellet (substrate) and supernatant. The protein concentration of the supernatant (unbound protein) was estimated using the Bio-Rad Bradford microassay. The substrate pellets containing bound protein were washed twice with 1.5 ml of PBS, and subsequently boiled for four minutes in 50 µl SDS-PAGE sample buffer. After centrifugation, the proteins that had been bound to the substrates, now solubilized and denatured, were analyzed by SDS-PAGE, using Coomassie protein stain and methanol/acetic acid (10% v/v) de-stain for visualization of the protein bands.

Binding studies of GbpA with mucin by ELISA
Rabbit mucin was prepared by intestine scraping as described previously [13]. Briefly, mucin was prepared by isopycnic ultracentrifugation in cesium chloride [45]. Primary intestinal epithelial cells (IEC) and their brush borders were isolated by previously described methods [46,47]. IEC were used at a concentration of 2 × 10^6 cells/ml and were stimulated for different time points in 5% CO2 at 37 °C with recombinant GbpA fl and truncations of GbpA. 100 ng of rabbit mucin was coated on the wells of a microtitre plate. The coated plate was kept at 4 °C overnight. The unbound mucin was washed off with phosphate buffered saline containing 0.5% Tween-20 (PBS-T) the next morning. The wells were then blocked with 5% milk in PBS-T for 2 hours. Recombinant GbpA or its mutant proteins (1-25 µM) were prepared in PBS, and then applied to the wells in triplicate. After 1.5 hours of incubation, unbound proteins were washed off with PBS-T, and anti-GbpA antibody was added to the wells at 1:250 dilution (prepared in 5% non-fat milk containing PBS-T), prior to incubation for 1.5 hours. Excess antibody was washed off with PBS-T, and HRP-conjugated anti-mouse antibody (1:400 dilution) in 5% non-fat milk containing PBS-T was added, and further incubated for 45 minutes. The unbound secondary antibody was washed off with PBS-T, and o-phenylenediamine/H2O2 was applied to develop the colour. The absorbance of each well was measured at 492 nm using an automated ELISA reader (Beckman Coulter). Absorbance readings were normalized, with the maximum binding assigned the value 1. Data fitting was done using Kyplot version 2.0 Beta15 (32 bit) to obtain the best-fit curves and the dissociation constant (Kd). Values are the means of triplicate determinations from two separate experiments.
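The text does not state which binding model was fitted in Kyplot. As a hedged illustration of how a Kd can be extracted from such a titration, the sketch below assumes a simple one-site saturation model, signal = Bmax · c / (Kd + c), and fits it by scanning Kd on a logarithmic grid while solving for Bmax in closed form. The function names and the synthetic data points are hypothetical and are not taken from the paper.

```python
def fraction_bound(conc, kd):
    """One-site saturation binding: fraction of sites occupied at this concentration."""
    return conc / (kd + conc)

def fit_one_site(concs, signals, kd_min=0.01, kd_max=1000.0, steps=4000):
    """Least-squares fit of signal = bmax * conc / (kd + conc).

    Kd is scanned on a log-spaced grid; for each trial Kd the best Bmax
    follows directly from linear least squares. Returns (kd, bmax).
    """
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        f = [fraction_bound(c, kd) for c in concs]
        bmax = sum(y * fi for y, fi in zip(signals, f)) / sum(fi * fi for fi in f)
        sse = sum((y - bmax * fi) ** 2 for y, fi in zip(signals, f))
        if best is None or sse < best[0]:
            best = (sse, kd, bmax)
    return best[1], best[2]

# Synthetic, noise-free example (hypothetical): true Kd = 5, true Bmax = 1.2,
# with concentrations spanning the 1-25 range used in the assay.
concs = [1.0, 2.5, 5.0, 10.0, 25.0]
signals = [1.2 * fraction_bound(c, 5.0) for c in concs]

kd, bmax = fit_one_site(concs, signals)
print(f"Kd ~ {kd:.2f} (same units as the concentrations), Bmax ~ {bmax:.2f}")
```

Fitting Bmax alongside Kd avoids the bias that arises when the highest measured point is simply taken as full saturation.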

Binding studies of GbpA with immobilized mucin
Rabbit mucin was diluted in binding buffer (0.2 M NaHCO3 + 0.5 M NaCl, pH 8.3) to a protein concentration of 2.6 mg/ml, and then coupled to NHS-activated sepharose resin (GE Healthcare). The unbound mucin was washed off with buffer A (0.5 M triethanolamine + 0.5 M NaCl, pH 8.8) followed by buffer B (0.2 M NaOAc + 0.5 M NaCl, pH 3.8) following the manufacturer's protocol. Different amounts of GbpA fl, its truncations or its mutants were added in equal volumes to 400 µl of mucin-coupled resin, and the mixtures were incubated for 2 hours at room temperature with shaking. Unbound protein was removed by centrifugation and was measured with protein assay reagent. Kd was calculated by plotting bound vs. bound/free for each fraction in a Scatchard plot.
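In a Scatchard plot, bound/free is plotted against bound; for a single class of binding sites the points fall on a straight line with slope -1/Kd and an x-intercept equal to the maximum binding capacity. The Python sketch below shows this calculation with an ordinary least-squares line. The function name and the example values are hypothetical, and the single-site assumption is ours.

```python
def scatchard_kd(bound, free):
    """Estimate Kd and Bmax from a Scatchard plot.

    x = bound, y = bound / free. For one class of sites,
    y = (Bmax - x) / Kd, so slope = -1 / Kd and x-intercept = Bmax.
    """
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = intercept * kd  # equivalent to -intercept / slope
    return kd, bmax


if __name__ == "__main__":
    # Hypothetical data generated with Kd = 4 and Bmax = 10 (arbitrary units).
    free = [1.0, 2.0, 4.0, 8.0]
    bound = [10.0 * f / (4.0 + f) for f in free]
    kd, bmax = scatchard_kd(bound, free)
    print(f"Kd = {kd:.2f}, Bmax = {bmax:.2f}")
```

A curved Scatchard plot would suggest more than one class of binding site or cooperativity, in which case a single Kd from a straight-line fit would not be meaningful.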

Inhibition of bacterial binding to mucin
100 ng of rabbit mucin was coated on the wells of a microtitre plate as described above. The wells were then blocked with 5% milk in PBS for 2 hours. Different concentrations of GbpA fl and truncations of GbpA were applied to the wells in triplicate. After 1.5 hours of incubation, unbound proteins were washed off with PBS. V. cholerae N16961 O1 El Tor was biotinylated as described previously [48]. 1 × 10⁴ CFU/ml of biotinylated V. cholerae were added to each well. After 1.5 hours of incubation, unbound bacteria were washed off with PBS, and HRP-conjugated avidin (1:250 in PBS) was applied to each well. The unbound HRP-conjugated avidin was washed off with PBS, and o-phenylenediamine and H2O2 were added. The absorbance of each well was measured at 490 nm.

Transformation of gbpA and its mutant constructs into N1RB3 strain
The gbpA knockout strain, N1RB3, was grown overnight as described previously [13]. Briefly, 100 µl of overnight culture were inoculated into 100 ml of LB, and incubated at 37 °C to A600 = 0.5. The cells were harvested, and washed five times in a total of 100 ml of sucrose buffer (272 mM sucrose, 1 mM HEPES, 10% glycerol, pH 8). Subsequently, the cells were resuspended in 1/100 of their original volume in buffer containing 272 mM sucrose and 10% glycerol. 50 µL of this cell suspension were used for electroporation with 500 ng of each construct. Pulse conditions were as follows: 2.5 kV voltage, 25 µF capacitance, 200 Ω resistance, and pulse time 3-3.5 ms. After electroporation, cells were resuspended in 950 µL of SOC medium, and grown for 1 hour at 37 °C. Then 200 µL of each cell suspension was plated on ampicillin-LB agar, and the plates were incubated for 24 hours at 37 °C.

Isolation of the outer membrane proteins (OMP)
Bacterial cells were harvested from 150 ml of culture of the different complemented N1RB3 strains [49]. The cells were washed twice with 0.1 M HEPES (pH 7) before resuspension in 7.5 ml of the same buffer. Then, each preparation was sonicated (intermittent pulses of 15 seconds, 18 pulses in total) to disrupt the cells, whilst maintaining the temperature at 4 °C. Intact cells were removed by centrifugation at 7,000 × g for 10 minutes. Each supernatant was further centrifuged at 100,000 × g for 1 hour, and the pellet was resuspended in 0.5% (w/v) N-laurylsarcosine-Na salt for 15 minutes with gentle agitation. As before, each sample was centrifuged at 100,000 × g for 1 hour to collect the outer membrane proteins as a pellet. The pellets were resuspended in 0.1 M HEPES.

Detection of GbpA on the surface of V. cholerae
Transformation of GbpA fl, truncations of GbpA, and its mutant constructs into N1RB3 was performed as reported previously [13]. For ELISA surface expression tests, the complemented strains were grown and fixed as described previously [50]. The wells were coated with 10⁷ bacterial cells overnight at 4 °C. Wells were blocked with 5% non-fat milk in PBS-T. After 2 h of blocking, antibodies against different GbpA truncation mutants were added to the wells at a 1:300 dilution. Excess antibodies were washed off with PBS-T, and HRP-conjugated anti-mouse IgG (1:800) was added. The reaction was developed with o-phenylenediamine and H2O2. The absorbance was read at 490 nm.

Intestinal colonization and fluid accumulation assays
Fluid accumulation was assayed as described previously [51]. The assay measures the amount of water and electrolyte that accumulates when V. cholerae colonizes the intestine and induces severe diarrhea. Briefly, 4- to 5-day-old BALB/c mice were intragastrically inoculated with a bacterial inoculum (1 × 10⁵ CFU) of the appropriate V. cholerae strain. Infected mice were sacrificed in a CO2 chamber at 4, 8, 12, 16, 20 and 24 hours post-infection. The mice were weighed, and their entire intestines were removed. Each of the separated intestines was weighed, and the FA ratios were calculated as described earlier [51]: FA ratio = intestinal weight / (whole body weight - intestinal weight). PBS-fed mice were used as negative controls. All animal experiments were conducted following the guidelines of the Institutional Animal Ethical Committee.
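As a worked illustration of the FA ratio formula above, the following Python sketch computes the ratio for a single animal. The function name and the example weights are hypothetical and are not measurements from this study.

```python
def fa_ratio(intestinal_weight_g, body_weight_g):
    """FA ratio = intestinal weight / (whole body weight - intestinal weight)."""
    if intestinal_weight_g <= 0:
        raise ValueError("intestinal weight must be positive")
    if body_weight_g <= intestinal_weight_g:
        raise ValueError("whole body weight must exceed intestinal weight")
    return intestinal_weight_g / (body_weight_g - intestinal_weight_g)


if __name__ == "__main__":
    # Hypothetical weights in grams for one infant mouse.
    print(round(fa_ratio(0.3, 2.3), 3))
```

Because the intestinal weight is subtracted from the denominator, the ratio compares the intestine (with accumulated fluid) to the rest of the carcass, so it rises as fluid accumulates regardless of differences in overall body size.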

Binding studies of full length, mutant and truncated GbpA to N1RB3
N1RB3 was grown overnight as described previously [13]. The bacteria were washed twice with PBS and then fixed with 0.5% formalin as described previously [50]. After overnight fixation, the bacteria were again washed twice with PBS, and then resuspended in PBS. These bacteria were diluted to 10⁸ CFU/ml (OD600 = 0.1), and 100 µL (10⁷ cells) of this suspension was coated in each well, and incubated overnight at 4 °C. GbpA fl, truncations of GbpA, and its mutants were applied to the wells at a fixed amount of 400 ng. After 1.5 hours of incubation, polyclonal antibodies against full length, mutant or truncated versions of GbpA were added to the wells at a dilution of 1:300 (v/v) in PBS-T. The amount of recombinant GbpA bound was detected with HRP-conjugated anti-mouse IgG (H+L) at a dilution of 1:800 and o-phenylenediamine + H2O2. The absorbance at 490 nm was measured in a microplate reader (BIORAD 550 CA).

Accession numbers
The Protein Data Bank accession number for the coordinates and structure factors of GbpA is 2XWX.

Supporting Information
Text S1 Supplementary data. Three figures (S1-3), two tables (T1-2) and additional Materials and Methods providing additional data to the experiments described in the main text. (DOC)