Improved expression and purification of the Helicobacter pylori adhesin BabA through the incorporation of a hexa-lysine tag.

Helicobacter pylori is a pathogenic bacterium that has the remarkable ability to withstand the harsh conditions of the stomach for decades. This is achieved through unique evolutionary adaptations, which include binding Lewis(b) antigens found on the gastric epithelium using the outer membrane protein BabA. We show here that the yield of a recombinant form of BabA, comprising its putative extracellular binding domain, can be significantly increased through the addition of a hexa-lysine tag to the C-terminus of the protein. BabA was expressed in the periplasmic space of Escherichia coli and purified using immobilised metal ion affinity and size exclusion chromatography, yielding approximately 1.8 mg of protein per litre of culture. The hexa-lysine tag does not inhibit the binding activity of BabA, as the recombinant protein was found to possess affinity towards HSA-Lewis(b) glycoconjugates.


Introduction
Helicobacter pylori is a microaerophilic gram-negative bacterium that is estimated to infect over half of the world's population [1]. This bacterium, which displays a very strict tissue tropism in humans for gastric mucosa, is an established cause of chronic gastritis and a major risk factor for the development of peptic ulcers and gastric cancer [2,3]. Using adhesion proteins to attach directly to the gastric epithelium, H. pylori is able to evade gastric mucus turnover and colonise the human stomach for several years [4].
One of the best characterised adhesins is the blood group antigen-binding adhesin (BabA) [5]. The native ~80 kDa protein is predicted to have a similar domain structure to other outer membrane proteins from the Hop (Helicobacter outer membrane porins) superfamily [6]. This structure consists of an extracellular N-terminal domain, involved in binding to host receptors, and a transmembrane C-terminal domain, predicted to form a β-barrel structure similar to crystallised porins [7]. It is well established that the bacteria use the BabA protein to bind to the Lewis b (Le b) antigen, which is an oligosaccharide abundantly present in the gastric mucosa of the majority of the Western population [8][9][10]. Because of its importance in gastric adherence, BabA contributes significantly to H. pylori virulence [11], and may represent a potential target for eradication therapy.
Given its clinical relevance, techniques to express recombinant BabA are of high importance to support biophysical and structural studies. Recently, there have been a number of reports describing successful expression strategies, all of which have used Escherichia coli as a host. These have included refolding from inclusion bodies [12], expression as a fusion protein using glutathione S-transferase (GST) as a solubility enhancing partner [13], and secretion to the periplasmic space [14]. However, each of these approaches has drawbacks. Refolding from inclusion bodies is time-consuming and proteins may not regain full function [15,16], GST-fusion proteins often require partner cleavage before NMR and crystallographic structure determination [17], and periplasmic expression is associated with low yield [18].
Indeed, our initial attempts to recombinantly express the N-terminal extracellular domain of BabA were challenging. Over-expression in the cytoplasm of E. coli led to the formation of insoluble aggregates and secretory expression in eukaryotic hosts was unsuccessful. Although fusion with a pelB leader sequence [14] resulted in expressed protein in the periplasmic space, the BabA fragment lacked solubility and we failed to obtain high yields of purified protein using this strategy.
We therefore investigated the use of a polycationic amino acid tag to increase protein solubility. Recent studies have shown that incorporating up to ten residues of arginine or lysine, at one terminus of a recombinant protein, can greatly improve solubility [19][20][21][22]. We show here that the addition of a hexa-lysine tag at the C-terminus significantly enhances the expression and purification of BabA and does not interfere with Lewis b binding. This strategy will facilitate structural and biophysical characterisation of the H. pylori outer membrane protein BabA, and thus, provide insight into H. pylori adherence mechanisms.

Experimental procedures
Bacterial strains and plasmids
E. coli XL10 Gold cells (Agilent Technologies, USA) were employed both for vector construction and protein expression using a modified pOPE101 expression vector (Progen Biotechnik, Germany).

Expression vector construction
Genomic DNA isolated from H. pylori J99 was used in a polymerase chain reaction to amplify a 1581 bp BabA gene fragment encoding its extracellular/putative binding domain (amino acids 21-547; GenBank accession no. AAD06409.1). The first 20 amino acids were excluded as they form a cleavable signal sequence for translocation to the H. pylori outer membrane through the Sec pathway, while amino acids 548-744 are predicted to be involved in the formation of the transmembrane β-barrel domain [7,14]. Gene amplification was achieved using KOD DNA polymerase (Novagen, Germany) and primers were designed to introduce the necessary restriction sites (Table 1) for incorporation into the pOPE101 expression vector (Progen Biotechnik, Germany).
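The stated fragment length follows directly from the residue range: each amino acid is encoded by one three-base codon. The short sketch below simply checks that arithmetic; the function name is illustrative and not part of the original protocol, and it counts only the coding region (no primer overhangs or restriction sites).

```python
def fragment_length_bp(first_residue: int, last_residue: int) -> int:
    """Length in base pairs of the coding sequence for an inclusive residue range."""
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("residue range must satisfy 1 <= first <= last")
    n_residues = last_residue - first_residue + 1  # inclusive count
    return 3 * n_residues  # one codon (3 bases) per residue


# Residues 21-547 of BabA (range taken from the text)
print(fragment_length_bp(21, 547))  # 1581
```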
Following amplification, PCR products were gel-purified using a QIAquick Gel Extraction Kit (Qiagen, USA) then double-digested alongside the pOPE101 expression vector using NcoI and BamHI restriction enzymes (New England Biolabs, USA). The BabA gene fragments were ligated into the expression vectors using T4 DNA ligase (New England Biolabs, USA) and transformed into chemically competent E. coli XL10 Gold cells for plasmid propagation. After verification of successful transformation using colony PCR, plasmid constructs were purified using a QIAprep miniprep kit (Qiagen, USA) and sequenced by Source BioScience (Nottingham, UK), to verify homology to the BabA J99 gene.
Protein expression
E. coli XL10 Gold cells containing BabA 547 or BabA 547 -6K expression constructs were grown in lysogeny broth (supplemented with 100 µg/mL ampicillin, 12.5 µg/mL tetracycline and 0.1 M glucose) at 37°C until the OD600 reached ~0.6. At this point, the temperature was reduced to 24°C and expression was induced with isopropyl β-D-1-thiogalactopyranoside (IPTG) to a final concentration of 0.1 mM. After 16 h, cells were harvested through centrifugation and an osmotic shock procedure was used to extract proteins from the periplasmic space [23].
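The final concentrations above can be converted into pipetting volumes with the usual dilution relation C1·V1 = C2·V2. The sketch below is only an illustration: the stock concentrations (1 M IPTG, 100 mg/mL ampicillin, 12.5 mg/mL tetracycline) and the 1 L culture volume are assumptions chosen for the example, not values reported in the text.

```python
def stock_volume_ml(final_conc: float, stock_conc: float, final_volume_ml: float) -> float:
    """Volume of stock (mL) giving `final_conc` in `final_volume_ml` (C1*V1 = C2*V2).

    `final_conc` and `stock_conc` must share the same units. The small volume
    added by the stock itself is neglected.
    """
    if stock_conc <= 0 or final_volume_ml <= 0 or final_conc < 0:
        raise ValueError("concentrations and volume must be positive")
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_conc * final_volume_ml / stock_conc


CULTURE_ML = 1000.0  # hypothetical 1 L culture

# Final concentrations are from the text; stock concentrations are assumed.
iptg_ml = stock_volume_ml(0.1, 1000.0, CULTURE_ML)           # mM final, mM stock
ampicillin_ml = stock_volume_ml(100.0, 100_000.0, CULTURE_ML)  # ug/mL final, ug/mL stock
tetracycline_ml = stock_volume_ml(12.5, 12_500.0, CULTURE_ML)  # ug/mL final, ug/mL stock

print(f"IPTG: {iptg_ml:.2f} mL, ampicillin: {ampicillin_ml:.2f} mL, "
      f"tetracycline: {tetracycline_ml:.2f} mL")
```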
Periplasmic extracts were boiled in a non-reducing SDS sample buffer and separated by SDS-PAGE on a 4-20% Mini-PROTEAN TGX gel (Bio-Rad, USA) and either visualised with Coomassie blue staining or transferred onto a nitrocellulose membrane for immunodetection using the c-Myc epitope tag. Western blotting, using a chemiluminescence detection method, was performed with a biotinylated mouse anti c-Myc [clone 9E10] antibody and Streptavidin:HRP (AbD Serotec, USA), both at a 1:5000 dilution. Total cell, culture medium and cytoplasmic fractions were similarly analysed under non-reducing conditions with Coomassie staining or c-Myc immunodetection. Total cell fractions were prepared by resuspending and boiling harvested cells in non-reducing SDS sample buffer. Culture medium fractions were prepared by concentrating proteins from the culture supernatant after cell harvesting with trichloroacetic acid followed by boiling in non-reducing SDS sample buffer. Cytoplasmic fractions were prepared by boiling spheroblasts (cells after periplasmic extraction) in non-reducing SDS sample buffer. Non-reducing conditions were used as the recombinant BabA proteins do not form intermolecular disulphide bonds, although they contain eight cysteine residues.

Protein purification
An osmotic shock method was used to obtain periplasmic extracts. Briefly, harvested cells were resuspended and gently shaken in a tenth of their culture volume of spheroblast solution (20% sucrose, 1 mM EDTA, 50 mM Tris-HCl, pH = 8.0) on ice for one hour. Cells were then pelleted, resuspended and gently shaken again in a tenth of their culture volume of 5 mM MgSO4 on ice for one hour, and pelleted once more. Solutions used for cell resuspension contained complete EDTA-free protease inhibitor cocktail tablets (Roche, USA), according to the manufacturer's instructions. The supernatant fractions obtained at both steps represent the periplasmic extract and were subsequently used for recombinant protein purification.
The recombinant BabA proteins were purified from the periplasmic extracts using their 6×His tags through immobilised metal ion affinity chromatography with EDTA-resistant Ni Sepharose excel resin (GE Healthcare, USA). Using an ÄKTApurifier system (GE Healthcare, USA), BabA 547 and BabA 547 -6K proteins were eluted with a step gradient of imidazole (up to 500 mM) in buffer containing 20 mM Tris-HCl (pH = 7.4), 300 mM NaCl. As a final purification step, size exclusion chromatography using a HiLoad 16/60 Superdex 75 gel filtration column (GE Healthcare, USA) was performed. The final storage buffer was 20 mM Tris-HCl (pH = 7.4), 300 mM NaCl. Protein concentration for periplasmic extracts and immobilised metal ion affinity chromatography elutions was determined with a Bradford assay reagent (Bio-Rad, USA), using BSA as a standard [24]. Image analysis using ImageJ (http://imagej.nih.gov/ij/) was used to verify the purity of proteins on Coomassie stained SDS-PAGE gels. Western blotting, as described above, to detect the c-Myc epitope tag was also employed. The concentration of purified BabA was determined by measuring absorbance at 280 nm using an extinction coefficient of 37290 M⁻¹ cm⁻¹ (calculated from the amino acid sequence of BabA 547 -6K) on a Nanodrop 2000 (Thermo Scientific, USA).
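The A280 measurement converts to concentration through the Beer-Lambert law, c = A / (ε·l). The sketch below uses the extinction coefficient given above; the absorbance reading and the molecular weight are hypothetical placeholders (the text does not report them), and the reading is assumed to be normalised to a 1 cm path length.

```python
EXTINCTION_COEFF = 37290.0  # M^-1 cm^-1, value stated in the text


def molar_concentration(a280: float, epsilon: float = EXTINCTION_COEFF,
                        path_cm: float = 1.0) -> float:
    """Beer-Lambert law: concentration in mol/L from absorbance at 280 nm."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a280 / (epsilon * path_cm)


def mass_concentration_mg_per_ml(a280: float, mw_da: float,
                                 epsilon: float = EXTINCTION_COEFF,
                                 path_cm: float = 1.0) -> float:
    """Concentration in mg/mL; mol/L times g/mol gives g/L, which equals mg/mL."""
    return molar_concentration(a280, epsilon, path_cm) * mw_da


# Hypothetical inputs for illustration only (not measured values from the paper)
example_a280 = 0.3729
example_mw_da = 60_000.0

print(f"{molar_concentration(example_a280) * 1e6:.1f} uM")
print(f"{mass_concentration_mg_per_ml(example_a280, example_mw_da):.2f} mg/mL")
```

Passing the molecular weight explicitly keeps the assumption visible: the mass concentration is only as reliable as the molecular weight supplied.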

Mass spectrometry analysis
Liquid chromatography-time-of-flight mass spectrometry was used to determine the molecular weights of proteins in purified BabA 547 and BabA 547 -6K protein samples. Five micrograms of each protein sample was loaded onto an Agilent 1100 Series LC (Agilent Technologies, USA) coupled to a time-of-flight Q-Tof Premier mass spectrometer (Waters, USA), which was equipped with an electrospray ioniser for acquisition in a positive ionisation mode. The mass scan range was set to 500-2000 m/z and MassLynx (Waters, USA) was used to analyse the mass data. N-terminal sequencing was used for determination of the first five amino acids of the pure BabA 547 -6K protein through Edman degradation (AltaBiosciences, Birmingham).
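In positive-mode electrospray ionisation an intact protein appears as a series of multiply protonated ions, with m/z = (M + z·m_p)/z for charge state z. The sketch below lists which charge states would fall inside the 500-2000 m/z scan range; the 60,000 Da protein mass is a hypothetical placeholder, since the expected masses are not given in the text.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in daltons


def mz_for_charge(mass_da: float, charge: int) -> float:
    """m/z of an [M + zH]^z+ ion."""
    return (mass_da + charge * PROTON_MASS_DA) / charge


def charge_states_in_window(mass_da: float, mz_min: float = 500.0,
                            mz_max: float = 2000.0) -> list[tuple[int, float]]:
    """Return (charge, m/z) pairs whose m/z lies within [mz_min, mz_max]."""
    if mass_da <= 0 or not (PROTON_MASS_DA < mz_min < mz_max):
        raise ValueError("need mass > 0 and proton mass < mz_min < mz_max")
    states = []
    charge = 1
    while True:
        mz = mz_for_charge(mass_da, charge)
        if mz < mz_min:  # m/z only decreases as charge increases
            break
        if mz <= mz_max:
            states.append((charge, mz))
        charge += 1
    return states


# Hypothetical protein mass for illustration only
states = charge_states_in_window(60_000.0)
print(f"charge states {states[0][0]}+ to {states[-1][0]}+ fall within the scan range")
```

Deconvolution software such as MassLynx works in the opposite direction, inferring M from the observed charge-state envelope.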
Tryptic peptide mapping was performed on gel slices of purified BabA protein bands after size exclusion chromatography. Briefly, bands were reduced with 10 mM DTT for 45 min at 65°C, followed by alkylation of protein thiols by incubation with 50 mM iodoacetamide in the dark at room temperature for 20 min. Digestion of the proteins was performed using 250 ng of trypsin for 16 h at 37°C. Resulting fragments were analysed by liquid chromatography-quadrupole time-of-flight mass spectrometry, using an UltiMate 3000 Nano LC system (Dionex, USA) coupled to a QSTAR Elite System (AB Sciex, USA), a hybrid quadrupole time-of-flight mass spectrometer fitted with electrospray ionisation for acquisition in a positive ionisation mode. Mascot (Matrix Science, USA) was used to analyse data.

Results
Protein expression
To improve the expression and purification of the H. pylori outer membrane protein BabA, we introduced a C-terminal hexa-lysine tag to a modified pOPE101 expression vector designed to secrete BabA proteins into the E. coli periplasmic space. Expression conditions of our BabA constructs, shown in Fig. 1, were kept identical in order to observe any effects of the hexa-lysine tag. Similar polycationic amino acid tags have been reported to improve the expression and solubility of other proteins [19][20][21][22].
Following induction with IPTG and overnight incubation, periplasmic proteins were extracted from the E. coli cells based on a well-established osmotic shock procedure in the presence of protease inhibitors [23]. Periplasmic proteins were separated via SDS-PAGE and visualised with both Coomassie staining and c-Myc immunodetection (Fig. 2).
As shown in Fig. 2, levels of c-Myc tagged recombinant protein in the periplasmic extracts are higher when the hexa-lysine tag is added to the C-terminus of BabA 547 . We further investigated whether the hexa-lysine tag increased the total amount of protein expressed or just the amount of protein secreted to the periplasmic space. Total cell, culture medium and spheroblast (E. coli cells with an extracted periplasmic space) fractions were similarly analysed via Coomassie staining and Western blotting. We found that both BabA 547 and BabA 547 -6K proteins are confined to the periplasmic space (Supporting Information, Figs. A.1 and A.2). The difference in expression levels can also be seen in total cell fractions, suggesting the hexa-lysine tag influences total expressed protein levels.

Protein purification
Recombinant BabA proteins were first purified from E. coli periplasmic extracts by immobilised metal ion affinity chromatography (IMAC); the purity of both recombinant proteins was relatively low after this procedure, as ascertained by Coomassie staining (Supporting Information, Fig. A.3). A major contaminant with a molecular weight between 37 and 50 kDa was also consistently observed in BabA 547 IMAC-purified fractions, but not in BabA 547 -6K IMAC-purified fractions. Furthermore, as shown in Table 2, the yield of BabA 547 was significantly lower after IMAC purification, compared to BabA 547 -6K. This poor recovery following IMAC purification is attributed to aggregate formation observed after protein elution with imidazole. These aggregates were separated via centrifugation and excluded from protein quantification and further purification steps.
In both cases, size exclusion chromatography (SEC) was used as a final purification step. As shown in Fig. 3, BabA 547 -6K IMAC-purified proteins eluted predominantly as a monomer off the gel filtration column at elution volume (V e ) ~60-70 mL. On the other hand, BabA 547 IMAC-purified proteins were found to have different characteristics in that elution began in the void volume (V e ≤ 46.5 mL), suggesting the presence of aggregates. Peak fractions (V e = 60-70 mL) were collected for both proteins for further characterisation.

Fig. 1. Schematic diagram of (A) pOPE101-BabA 547 and (B) pOPE101-BabA 547 -6K expression cassettes. Inserts were cloned into the pOPE101 vector using NcoI and BamHI restriction enzyme sites. The abbreviations and symbols represent the following: Pro - synthetic lac promoter (P/A1/04/03); lacO - lac operator; arrow - AUG start codon/transcriptional start; pelB - cleavable pelB leader sequence; BabA 547 - BabA gene fragment; 6x Lys - hexa-lysine peptide tag; c-Myc - Myc 9E10 epitope tag; 6x His - hexahistidine peptide tag; X/STOP - UAA stop codon/transcriptional termination.

Mass spectrometry analysis
Mass spectrometry was used to verify the molecular weight of the purified recombinant BabA proteins. Surprisingly, the observed molecular weights of both BabA 547 and BabA 547 -6K were 1014 Da smaller than their expected molecular weights (Figs. 4B and 5B). This finding was reproduced across purifications and corresponds to a loss of the first nine amino acids of the expected BabA proteins, as confirmed by Edman degradation (data not shown). We speculate that this N-terminal truncation occurs during pelB leader cleavage, upon periplasmic secretion, but further experimentation is needed to confirm this.
Mass spectrometry was also used, alongside Coomassie staining of SDS-PAGE gels containing purified recombinant BabA proteins, to assess final purity. For BabA 547 -6K, a single protein of high purity was found (Fig. 4). On the other hand, the Coomassie stained gel of purified BabA 547 contained two protein bands (Fig. 5A), which match the two prevalent proteins found after IMAC purification (Supporting Information, Fig. A.3). The larger protein band could be detected using monoclonal antibodies directed to the C-terminal c-Myc epitope tag, but this was not the case with the lower protein band (data not shown). Mass spectrometry showed that this lower band actually consisted of four lower molecular weight proteins, differing in size by only a few amino acids (Fig. 5B).
As this is a characteristic of protein degradation, the additional lower band was excised and subjected to tryptic digestion for peptide mapping. This analysis confirmed that the lower band consisted of peptide fragments of BabA 547 (45% sequence coverage; Fig. 6). Moreover, this truncation appears to occur at the C-terminus of the BabA 547 protein as no peptide masses after K383 were matched and, as previously mentioned, the C-terminal c-Myc epitope tag could not be detected. As these degradative fragments would consequently lack their C-terminal 6xHis tag, their appearance in IMAC purification fractions is probably the result of non-specific binding to the Ni Sepharose excel resin, particularly given that no imidazole is used in equilibration and sample buffers [25]. Because no degradative fragments were found with the BabA 547 -6K protein following purification, we suggest the hexa-lysine tag may also prevent proteolytic degradation in addition to enhancing solubility.

Binding activity of recombinant BabA
Finally, we assayed the interaction between recombinant BabA and the Lewis b oligosaccharide. The binding activity of the BabA 547 protein could not be appropriately characterised due to poor protein yield and purity. However, BabA 547 -6K was studied in a sandwich ELISA format where Lewis b -HSA glycoconjugates were immobilised on Maxisorp plates and bound protein detected through antibodies directed against the C-terminal c-Myc tag. Unconjugated HSA and Lewis y -HSA glycoconjugates were also included to assess BabA 547 -6K binding specificity.
As shown in Fig. 7, the BabA 547 -6K protein retained its binding affinity towards Lewis b -HSA glycoconjugates, binding in a concentration-dependent manner. Additionally, this recombinant BabA protein does not display non-specific binding towards HSA or Lewis y -HSA glycoconjugates.

Discussion
A method to efficiently produce a recombinant form of BabA is desirable to support biophysical and structural studies. However, expression and purification of soluble BabA in our lab proved to be very challenging. A recent report described successful expression of an active protein in the periplasmic space of E. coli XL10 Gold cells using a pOPE101 vector [14]. Based on this expression strategy, we show here that a C-terminal hexa-lysine tag enhances the expression and purification of a recombinant BabA protein that contains amino acids 10-547 of the mature, native protein from strain J99.
We initially hypothesised that a hexa-lysine tag would improve the solubility of BabA 547 and, as such, reduce the amount of aggregates observed after IMAC purification. Although this effect was noticed, we also found a significant increase in expressed BabA 547 -6K levels in the periplasmic space. The improved expression was due to an increase in the total amount of recombinant BabA rather than an increase in the ratio of protein found in the periplasmic space to other cellular compartments or secreted fractions.
The hexa-lysine tag appears to protect against proteolytic degradation, resulting in an increased yield of BabA: a considerable amount of degradative fragments is found in BabA 547 , but not BabA 547 -6K, purified fractions. The mechanism or location of protein degradation is uncertain, but BabA 547 digestion may very well occur during expression. Indeed, the periplasmic space of E. coli is known to host at least 20 proteases with various functionalities [26]. The protease DegP is known to degrade recombinant proteins that are misfolded upon overproduction in order to prevent aggregate formation [27]. Another periplasmic protease, Tsp, is known to have a preference for unfolded substrates with hydrophobic amino acids at their C-termini [28]. Further experiments are necessary to confirm whether BabA 547 , expressed in the E. coli periplasmic space, is a substrate for these, or other, periplasmic proteases. However, it is possible that the C-terminal hexa-lysine tag interacts with other amino acids in the protein to enhance correct folding, or prevents protease access to susceptible cleavage sites. Firm conclusions cannot be drawn until BabA structural information becomes available.

Fig. 7. Binding of recombinant BabA to immobilised HSA, Lewis y -HSA and Lewis b -HSA glycoconjugates was determined using a sandwich ELISA (n = 3, error bars represent the standard error of the mean). ns = not significant (P > 0.05), paired two-tailed t-test.
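The paired two-tailed t-test named in the Fig. 7 caption can be sketched with the standard library alone. With n = 3 pairs there are two degrees of freedom, for which the t-distribution has a closed-form two-tailed P value, P = 1 - |t| / sqrt(t² + 2); the function below relies on that special case and is not a general implementation. The absorbance values are hypothetical placeholders, not data from the paper.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a: list[float], b: list[float]) -> float:
    """t statistic for paired samples: mean difference over its standard error."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))


def two_tailed_p_df2(t: float) -> float:
    """Two-tailed P value for a t statistic with exactly 2 degrees of freedom."""
    return 1.0 - abs(t) / math.sqrt(t * t + 2.0)


# Hypothetical ELISA absorbances for three replicates (illustration only)
hsa_signal = [0.10, 0.12, 0.11]
lewis_y_signal = [0.11, 0.12, 0.13]

t = paired_t_statistic(hsa_signal, lewis_y_signal)
p = two_tailed_p_df2(t)
print(f"t = {t:.3f}, P = {p:.3f}, {'ns' if p > 0.05 else 'significant'}")
```

For other sample sizes the P value would need the general t-distribution, for example from a statistics library.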
Using an ELISA, it was confirmed that BabA 547 -6K possessed binding activity towards Lewis b -HSA glycoconjugates. Furthermore, comparable to the reported activity of the native BabA protein [13], no non-specific binding to the related Lewis y -HSA glycoconjugates was found. These findings suggest that our recombinant protein is correctly folded during expression and may provide useful insight into the BabA adhesin through structural and biophysical studies, which are currently being pursued by our laboratory. Further study is also underway to determine if this recombinant expression strategy is applicable to different BabA truncates and other bacterial adhesins.

Conclusion
This work shows that the presence of a C-terminal hexa-lysine tag sequence greatly improves the total yield of a recombinant BabA protein expressed in the periplasmic space of E. coli. Once purified, the recombinant protein retains its binding affinity towards Lewis b antigens and can therefore be used for further biophysical and structural characterisation to gain insight into the BabA adhesin.