Refinement of the Fusion Tag PagP for Effective Formation of Inclusion Bodies in Escherichia coli

ABSTRACT Methods for efficient insoluble protein production require further exploration. PagP, an Escherichia coli outer membrane protein with high β-sheet content, could function as an efficient fusion partner for inclusion body-targeted expression of recombinant peptides. The primary structure of a given polypeptide determines to a large extent its propensity to aggregate. Herein, aggregation “hot spots” (HSs) in PagP were analyzed using the web-based software AGGRESCAN, leading to identification of a C-terminal region harboring numerous HSs. Moreover, a proline-rich region was found in the β-strands. Substitution of these prolines by residues with high β-sheet propensity and hydrophobicity significantly improved its ability to form aggregates. Consequently, the absolute yields of recombinant antimicrobial peptides Magainin II, Metchnikowin, and Andropin were increased significantly when expressed in fusion with this refined version of PagP. We describe separation of recombinant target proteins expressed in inclusion bodies fused with the tag. An artificial NHT linker peptide with three motifs was implemented for separation and purification of authentic recombinant antimicrobial peptides. IMPORTANCE Fusion tag-induced formation of inclusion bodies provides a powerful means to express unstructured or toxic proteins. For a given fusion tag, how to enhance the formation of inclusion bodies remains to be explored. Our study illustrated that the aggregation HSs in a fusion tag played important roles in mediating its insoluble expression. Efficient production of inclusion bodies could also be implemented by refining its primary structure to form a more stable β-sheet with higher hydrophobicity. This study provides a promising method for improvement of the insoluble expression of recombinant proteins.

partially folded or misfolded intermediates can be rapidly degraded in vivo. Thus, their accumulation can be improved significantly by shifting the equilibrium toward the formation of IBs to favor protection from proteolytic degradation. For a given protein, it is often difficult to assess which insoluble fusion partner is the most effective at targeting it into IBs. Nevertheless, some insoluble fusion partners have been demonstrated to work well with multiple target peptides or proteins, such as ketosteroid isomerase (KSI), EDDIE, and PagP (1).
When overexpressed in Escherichia coli, fungal prion HET-s accumulated predominantly in IBs with high β-sheet content and displayed very similar characteristics to amyloid proteins. The kinetics of HET-s amyloid fibril formation demonstrated that amyloid growth was nucleation dependent (3). Accumulating evidence has shown that IBs are highly ordered aggregates that bear a characteristic cross-β structure similar to amyloid fibers. As occurs for amyloids, formation of IBs is promoted by intermolecular interactions in a nucleation-dependent manner through hydrophobic protein patches (4). Mutation of Pro-102 or Pro-105 to leucine in human prion (PrP) can greatly promote the formation of PrP aggregates since proline residues normally disfavor the formation of β-sheet conformation (5). IBs have been recognized as a valuable model to understand protein aggregation in eukaryotes and search for specific inhibitors or disaggregation approaches (6).
As important effectors of the innate immune system in multicellular organisms, antimicrobial peptides (AMPs) protect their hosts against a large variety of invading pathogens. Besides their antimicrobial activities, some AMPs are also recognized for their immunomodulatory properties (7). Recombinant expression of AMPs in E. coli faces two challenges: AMPs, which are rich in basic amino acid residues, are highly susceptible to proteolytic degradation due to their unordered structure, and they can be toxic to the producing hosts. To overcome both obstacles, a commonly used strategy is to attach them to fusion tags (8,9). Targeting AMPs to IBs is believed to be more effective than their soluble fusion expression for masking their toxic effects and protecting them from proteolytic degradation (1,10). For a given AMP, selecting a suitable fusion tag to confer efficient expression in IBs remains empirical and usually requires screening. Additionally, the repertoire of IB-targeted fusion tags needs to be expanded.
Development of effective methods for recombinant production and purification of AMPs is of practical significance (11). In the present work, aggregation "hot spots" (HSs) of the fusion tag PagP were analyzed. Based on structural analysis, we demonstrated that the C-terminal region of PagP contains numerous HSs and could function as an effective fusion tag to target AMPs to IBs. Furthermore, our data showed that mutation of proline residues in or near the aggregation HSs to hydrophobic residues could significantly improve its potency as an IB-targeting fusion tag. Few examples have been presented to demonstrate the separation of recombinant target proteins expressed in IBs fused with a tag. Herein, we describe an alternative approach to recover authentic AMPs expressed as IBs in fusion with PagP or its refined version.

RESULTS
Targeting recombinant AMPs to IBs using the PagP fusion tag. Unfavorable patterns of codon usage can affect high-level expression of recombinant proteins (12). The Metch, Andropin, and Mag II AMP genes were designed with a codon pattern adapted to the codon usage bias of E. coli using the web platform at http://genomes.urv.es/OPTIMIZER (13). Meanwhile, an artificial short peptide (NHT) with three functional motifs (motifs I to III) was designed and appended at the N terminus of AMPs (Fig. 1A). Motif I (ASRHWMAG) allows the fusion proteins to be hydrolyzed site-specifically by Ni2+ ions; motif II (HHHHHH) allows the fusion AMPs (NHT-Metch, NHT-Andropin, and NHT-Mag II) to be purified by Ni-chelating chromatography; and motif III (ENLYFQ) allows the fusion AMPs to be cleaved site-specifically by tobacco etch virus (TEV) protease to release authentic Metch, Andropin, and Mag II.
PagP accumulates in IBs when overexpressed in E. coli (1). As expected, the PagP-NHT-Metch fusion protein accumulated predominantly in IBs (Fig. 1). Insoluble fusions in IBs are refractory to site-specific cleavage by proteases such as thrombin and TEV. An optimized amino acid sequence has been developed for Ni(II)-catalyzed cleavage (14), and it has been demonstrated that nearly complete nickel-catalyzed hydrolysis of fusion proteins can be achieved under denaturing conditions (15). Full cleavage of the PagP-NHT-Metch fusion protein was achieved after 24 h of hydrolysis at 60°C, as confirmed by the appearance of the PagP fusion tag (Fig. 2A). Consequently, the released soluble His6-tagged passenger peptide NHT-Metch could be purified by Ni-affinity chromatography (Fig. 2B).

Refinement of the PagP tag to target recombinant peptides more efficiently to IBs. As an integral outer membrane protein, E. coli PagP exhibits a β-barrel architecture with a hydrophobic exterior facing the membrane bilayer and a hydrophilic interior tunnel (16). Many aggregation HSs have been characterized in proteins governing neurodegenerative and systemic amyloidogenic diseases (17). Based on the aggregation propensities of natural amino acids derived from in vivo experiments, the web-based AGGRESCAN software has been developed for prediction of aggregation-prone segments in protein sequences (18). To identify the putative aggregation HSs of the PagP fusion tag, we analyzed its sequence with the AGGRESCAN web tools, and seven HSs (HS1 to HS7) were identified (Fig. 3). These HSs were distributed unevenly along the amino acid sequence, with four HSs present in the C-terminal region comprising residues 101 to 161. This finding implied that the C-terminal region of PagP has great potential to aggregate. We therefore reconstructed a series of fusion tags using the C-terminal region of PagP as the template.
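The idea behind a hot-spot scan can be illustrated with a short sliding-window calculation. The following is a minimal sketch and not the AGGRESCAN algorithm: it substitutes the Kyte-Doolittle hydropathy scale for the experimentally derived AGGRESCAN values, and the window size, threshold, minimum run length, and the rule that a proline ends a run are all assumptions chosen for illustration. The toy sequences are hypothetical and are not PagP.

```python
# Kyte-Doolittle hydropathy values, used here only as a stand-in scale.
# AGGRESCAN uses its own per-residue aggregation propensities instead.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_scores(seq, window=5):
    """Centered sliding-window mean; windows are truncated at the ends."""
    half = window // 2
    scores = []
    for i in range(len(seq)):
        chunk = seq[max(0, i - half): i + half + 1]
        scores.append(sum(KD[aa] for aa in chunk) / len(chunk))
    return scores


def hot_spots(seq, threshold=1.0, window=5, min_len=5):
    """Return 1-based inclusive (start, end) runs scoring above threshold.

    Assumption for this sketch: a proline always terminates a run, and
    runs shorter than min_len residues are discarded.
    """
    seq = seq.upper()
    scores = window_scores(seq, window)
    spots, start = [], None
    for i, (aa, score) in enumerate(zip(seq, scores)):
        inside = score > threshold and aa != "P"
        if inside and start is None:
            start = i
        elif not inside and start is not None:
            if i - start >= min_len:
                spots.append((start + 1, i))
            start = None
    if start is not None and len(seq) - start >= min_len:
        spots.append((start + 1, len(seq)))
    return spots


if __name__ == "__main__":
    # Hypothetical toy sequences: a hydrophobic stretch with and without Pro.
    for toy in ("KKKKKIIIPIIIKKKKK", "KKKKKIIILIIIKKKKK"):
        print(toy, hot_spots(toy))
```

In this toy case a single proline splits the hydrophobic stretch into two runs that are too short to count, whereas the Pro-to-Leu version is reported as one continuous segment.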
Peptide NHT-Metch was then separately fused with these constructed tags (Fig. 4). When targeted for expression in E. coli, PagP-1-NHT-Metch, PagP-2-NHT-Metch, and PagP-3-NHT-Metch accumulated in considerable amounts (Fig. 5A). However, there was no obvious accumulation of fusion proteins for tags PagP-4, PagP-5, and PagP-6 (Fig. 5B), which possessed a denser distribution of aggregation HSs owing to addition of HS3 or a combination of HS3 with HS2 and HS1. Our results showed that simply combining or adding aggregation HSs to a given fusion tag might negatively affect its ability to target passenger peptides to IBs. Compared with PagP-NHT-Metch, accumulation of PagP-1-NHT-Metch decreased in terms of overall quantity. However, the absolute yield of the recombinant peptide increased by ~40% due to the decreased molecular weight of the fusion tag PagP-3 (Fig. 5C). We then fused NHT-Andropin and NHT-Mag II with these tags. As for NHT-Metch, all fusion tags targeted NHT-Andropin to IBs effectively, increasing the absolute yield of recombinant Andropin by up to almost 50% (Fig. 6A and B). However, these fusion tags lost their efficacy when fused with passenger NHT-Mag II, the accumulation of which was barely detected by SDS-PAGE (Fig. 6C). Our results suggest that the effectiveness of a given fusion tag for IB-targeted expression was significantly affected by the physicochemical properties of the passenger proteins.
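The trade-off between lower fusion protein accumulation and a smaller tag can be made concrete with a rough calculation. The sketch below approximates mass fractions by residue counts. The 61-residue C-terminal region (residues 101 to 161) and the 20-residue NHT linker come from the text; the 161-residue full-length tag, the 26-residue passenger, and the relative accumulation of 0.72 are illustrative assumptions, not measured values.

```python
NHT_LEN = 20  # residues in the NHT linker ASRHWMAGHHHHHHENLYFQ


def passenger_fraction(tag_len, passenger_len, linker_len=NHT_LEN):
    """Approximate mass fraction of the passenger in a tag-linker-passenger
    fusion, using residue counts as a proxy for mass."""
    return passenger_len / (tag_len + linker_len + passenger_len)


def relative_peptide_yield(fusion_ratio, tag_len_new, tag_len_ref, passenger_len):
    """Peptide yield with a new tag relative to a reference tag.

    fusion_ratio is the accumulated fusion protein mass with the new tag
    divided by that with the reference tag.
    """
    return (
        fusion_ratio
        * passenger_fraction(tag_len_new, passenger_len)
        / passenger_fraction(tag_len_ref, passenger_len)
    )


if __name__ == "__main__":
    # Hypothetical inputs: 161-residue reference tag, 26-residue passenger,
    # and a truncated-tag fusion accumulating at 72% of the reference.
    gain = relative_peptide_yield(0.72, 61, 161, 26)
    print(f"relative peptide yield: {gain:.2f}")
```

Under these assumed numbers, the passenger makes up a much larger share of the truncated fusion, so the peptide yield can rise even though less fusion protein accumulates.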
Mutation of PagP for more efficient formation of IBs. Recombinant IBs in E. coli possess the characteristic cross-β structure of amyloid fibers (19). Changing the hydrophobicity or propensity to form a β-strand may affect the aggregation of a given protein when overexpressed. We analyzed the secondary structure of PagP and highlighted its C-terminal region, comprising four β-sheet elements (Fig. 3) (16). We found that this region is rich in proline residues (P-121, P-123, P-127, and P-135), which are scarce in β-sheet structure. Compared with proline, isoleucine or leucine has a higher propensity to form the β-sheet (20). Moreover, isoleucine and leucine have bulky hydrophobic side chains that favor intermolecular and intramolecular interactions between side chains, facilitating the formation of aggregates. We replaced Pro with Leu or Ile to favor the formation of β-strands. A series of mutants (P127I-PagP-NHT-Mag II, P135L-PagP-NHT-Mag II, P121L/P123L-PagP-NHT-Mag II, P127I/P135L-PagP-NHT-Mag II, P121L/P123L/P135L-PagP-NHT-Mag II, and P121L/P123L/P127I/P135L-PagP-NHT-Mag II) were constructed. When overexpressed in E. coli, these mutants accumulated in greater quantities compared with PagP-NHT-Mag II. The yields of these recombinant mutants were increased by 44.3 to 60.5% (Fig. 7). These results demonstrated that the aggregation ability of a fusion tag could be further improved by increasing its overall hydrophobicity or enhancing its propensity to form a more stable β-sheet conformation. Nevertheless, the ability of PagP to aggregate was not improved further by additional increases in its overall hydrophobicity or propensity to form stable β-strands. The accumulation level of group I (P127I/P135L-PagP-NHT-Mag II, P121L/P123L/P135L-PagP-NHT-Mag II, and P121L/P123L/P127I/P135L-PagP-NHT-Mag II mutants) decreased slightly in comparison with that of group II (P127I-PagP-NHT-Mag II, P135L-PagP-NHT-Mag II, or P121L/P123L-PagP-NHT-Mag II mutants; Fig. 7). We also constructed another series of fusion proteins for recombinant expression of AMPs, using the solubility-enhancing tag thioredoxin A (TrxA) and the insolubility-targeting tag derived from the histone fold domain (HFD) of the human transcription factor TAF12 (HFD-TAF) (21,22), respectively. As shown in Fig. 8A, the fusion proteins TrxA-NHT-Metch and TrxA-NHT-Andropin could accumulate in considerable quantities, while TrxA-NHT-Mag II could not. Moreover, the tag HFD-TAF could not target the three antimicrobial peptides (Metch, Andropin, and Mag II) to accumulate in insoluble aggregates. Compared with the refined versions of PagP (PagP-1 and P127I-PagP), the TrxA and HFD-TAF tags supported much less efficient recombinant expression of the AMPs (Metch, Andropin, and Mag II) (Fig. 8B).

Fusion Tags Can Be Refined to Form Inclusion Bodies
Microbiology Spectrum

Purification of recombinant AMPs and analysis of biological activities. Fusion tags must normally be cleaved and removed due to their potential to interfere with the activities of passenger proteins. This is especially challenging in the case of IB-targeted expression since IBs are usually solubilized in harsh denaturants or detergents, precluding the utilization of enzymatic cleavage (for example, with TEV protease or thrombin).
With the help of the artificially designed peptide NHT, recombinant AMPs in fusion with NHT could be conveniently purified by Ni-affinity chromatography after Ni2+-catalyzed specific cleavage. Eventually, authentic AMPs were recovered following successive site-specific cleavage by TEV protease and ion-exchange chromatography (Fig. 9). Mag II and Andropin exhibited biological activity against both Gram-positive and Gram-negative bacteria (23,24), while Metch showed inhibitory activity toward Gram-positive bacteria (25). We then tested antimicrobial activity using classical inhibition zone assays. As expected, the appearance of clear inhibition zones confirmed that the recombinant peptides possessed their native antimicrobial activities against both Gram-positive S. aureus and Gram-negative E. coli, or only against Gram-positive S. aureus in the case of Metch (Fig. 10; Table 1).
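The substitution labels used for the PagP mutants in the Results (for example, P121L/P123L) encode the wild-type residue, its position, and the replacement. A minimal sketch of how such labels could be applied to a sequence during construct design is given below. The function name, the offset parameter, and the toy fragment are hypothetical; the fragment is not the PagP sequence and only places prolines at positions 121, 123, and 127 for illustration.

```python
import re


def apply_mutations(seq, label, offset=1):
    """Apply substitutions written like 'P121L/P123L' to a protein sequence.

    offset is the residue number assigned to seq[0]. Each token is checked
    against the sequence so that a mislabeled wild-type residue or an
    out-of-range position raises ValueError instead of passing silently.
    """
    residues = list(seq)
    for token in label.split("/"):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token.strip())
        if match is None:
            raise ValueError(f"cannot parse mutation {token!r}")
        wild_type, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3)
        )
        index = position - offset
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(
                f"expected {wild_type} at {position}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


if __name__ == "__main__":
    # Hypothetical fragment numbered from residue 119 (not the PagP sequence).
    toy_fragment = "AAPAPAAAPAA"
    for label in ("P127I", "P121L/P123L", "P121L/P123L/P127I"):
        print(label, apply_mutations(toy_fragment, label, offset=119))
```

Checking the wild-type residue before substituting is a simple guard against numbering mismatches, for example between a mature protein and a precursor that still carries a signal peptide.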

DISCUSSION
Compared with solubility-enhancing tags, IB-targeting fusion partners are believed to be more effective at protecting passenger proteins from proteolytic degradation and masking their toxic activities. Using a given fusion tag as the template, there are few examples illustrating how to refine its sequence to increase the yields of passenger proteins. It has been proposed that the primary structure of a polypeptide intrinsically determines its propensity to aggregate (26). Moreover, some short specific amino acid stretches play crucial roles as the initial nucleators (HSs) during aggregation. The finding of unevenly distributed HSs at the C-terminal region of PagP prompted us to investigate whether this C-terminal region alone could retain high potency to direct passenger peptides to IBs. In fact, the increased absolute yields of passenger peptides implied that a larger number of molecules formed aggregates with truncated PagP, reinforcing its potency to form aggregates. A larger fusion tag implies a lower peptide-to-tag ratio, disfavoring the final yield of target peptides (8). Screening and identification of effective shorter fusion tags is therefore of practical significance. Nevertheless, we cannot ignore the fact that these versions of PagP lost their ability to efficiently target Mag II to IBs. Some IBs have been shown to have considerable biological activity, characterized by a loose arrangement of protein molecules (27). Mag II is a very potent AMP that exerts its biological activity by destroying the integrity of the cytoplasmic membrane (19). When fused with truncated PagP (PagP-1, PagP-2, or PagP-3), the resulting fusion proteins might retain partial antimicrobial activity or be unable to form stable aggregates immediately, making them more vulnerable to proteolytic degradation induced by multiple levels of cellular responses.
Bacterial IBs have been found to possess some amyloid-like properties and contain similar structures (4,6). Much effort has been made to understand the molecular mechanisms underlying amyloid cascades (28). Aβ42 is an amyloidogenic peptide in which a central hydrophobic stretch (residues 17 to 21) is predicted to be an aggregation HS (18). Decreasing hydrophobicity or β-sheet propensity in this stretch could significantly affect aggregation propensity and neurotoxicity (29,30). In addition, the presence of proline residues is believed to decrease the overall protein aggregation propensity (31). A hydrophobic region (residues 58 to 63) was identified in microcin E492 (MccE492). Compared with wild-type MccE492, mutants P57A and P59A exhibited a greater tendency to form amyloid aggregates in vivo and aggregated significantly faster in vitro (32). PagP is an outer membrane protein rich in β-strands that readily accumulates in IBs when expressed in the E. coli cytoplasm (16,33). We found that there are several proline residues (P121, P123, P127, and P135) inside or near the β8 strand (Fig. 3). Leu and Ile are more hydrophobic and have a higher propensity to form β-strands than Pro (20). These findings prompted us to construct mutants P121L/P123L-PagP-NHT-Mag II,   (1). Partially folded forms are vulnerable to proteolytic degradation. These mutants might aggregate more rapidly, bringing about enhanced resistance to proteolytic degradation. This hypothesis is further supported by the scarce accumulation of recombinant fusion protein PagP-1-NHT-Mag II. Similar reasons can account for the unsuccessful expression of HFD-TAF fusions. The increased accumulation levels demonstrated that the aggregation capability of a given IB-targeted fusion tag could be improved significantly by properly enhanced hydrophobicity or propensity to form β-strands (Fig. 7). To our knowledge, few previous reports have demonstrated that a fusion tag can be successfully improved to form aggregates in E. coli.

Many linear AMPs have an undefined structure in aqueous solution and are highly susceptible to proteolytic degradation during heterologous expression (34). Targeting AMPs to IBs is preferable to protect them from cellular proteases and mask their cellular toxicity (35). However, fusion tag removal is often necessary.
Soluble fusion proteins are tractable to site-specific cleavage by proteases such as thrombin or TEV protease (36). However, this strategy may not be a feasible option when fusion proteins exist in insoluble IBs. In this setting, fusion tag removal is biochemically challenging. Few examples have illustrated how to separate target peptides from fusion proteins expressed in IBs. A previous report has described a method for purification of target proteins expressed in fusion with PagP using nickel ion-catalyzed peptide bond hydrolysis (37). Even so, this method is not applicable to purification of short peptides. In this work, we designed an NHT linker sequence inserted between the fusion tag and AMPs (Fig. 1). This artificial peptide possesses three functional motifs, allowing specific Ni(II)-catalyzed cleavage, Ni-chelating affinity chromatography, and TEV protease-mediated specific cleavage to be performed successfully. This approach is particularly applicable to production of short unstructured peptides. We believe that the approach presented herein will promote IB-targeted expression of short peptides.

MATERIALS AND METHODS
Bacterial strains and plasmid construction. The E. coli DH5α strain was used as a host for gene cloning and preparation of plasmids, while E. coli BL21(DE3) was used as a host strain for expression of recombinant proteins. All E. coli strains were cultured in LB medium supplemented with antibiotics as needed.
Antimicrobial peptides magainin II (Mag II) (23), Metchnikowin (Metch) (25), and Andropin (24) were selected as test cases for this study. Mag II, Metch, and Andropin genes were synthesized by overlap extension PCR according to their amino acid sequences, with a codon pattern adapted to the usage bias of E. coli (13,38). The Mag II gene was first synthesized by overlap extension PCR with primers 1 to 4, then sequentially amplified with primers 8 and 4, primers 9 and 4, and primers 10 and 4. The resulting amplicon encoded a Mag II fusion peptide with a sequence (ASRHWMAG) for Ni(II)-dependent peptide bond hydrolysis (39), a His6 tag (HHHHHH), and a recognition site (ENLYFQ) for the site-specific protease tobacco etch virus (TEV) (40) at its N terminus. The resulting sequence ASRHWMAGHHHHHHENLYFQ was denoted NHT.
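The layout of the NHT linker and the order of the two cleavage steps can be sketched in a few lines. This is an illustration under stated assumptions rather than a description of the chemistry: the position of the Ni(II) cut (placed here immediately before the Ser of motif I) is an assumption of the sketch, and the tag and passenger strings are hypothetical placeholders. Only the three motif sequences and their order come from the text.

```python
MOTIF_NI = "ASRHWMAG"   # motif I: Ni(II)-dependent hydrolysis site
MOTIF_HIS = "HHHHHH"    # motif II: His6 tag for Ni-chelating chromatography
MOTIF_TEV = "ENLYFQ"    # motif III: TEV protease recognition site
NHT = MOTIF_NI + MOTIF_HIS + MOTIF_TEV


def build_fusion(tag, passenger):
    """Assemble a tag-NHT-passenger fusion sequence."""
    return tag + NHT + passenger


def nickel_cleave(fusion):
    """Split the fusion at motif I into (tag fragment, His-tagged peptide).

    Assumption of this sketch: the cut is placed immediately before the
    Ser residue of motif I, at the first occurrence of the motif.
    """
    cut = fusion.index(MOTIF_NI) + MOTIF_NI.index("S")
    return fusion[:cut], fusion[cut:]


def tev_cleave(peptide):
    """Split after the ENLYFQ site into (linker remnant, released peptide)."""
    cut = peptide.index(MOTIF_TEV) + len(MOTIF_TEV)
    return peptide[:cut], peptide[cut:]


if __name__ == "__main__":
    # Hypothetical placeholder sequences, not taken from the study.
    fusion = build_fusion(tag="MTAGTAG", passenger="GIGKFLHS")
    tag_fragment, his_peptide = nickel_cleave(fusion)
    remnant, released = tev_cleave(his_peptide)
    print(tag_fragment, his_peptide, remnant, released, sep="\n")
```

The sketch shows why the motif order matters: the His6 tag stays on the passenger after the first cut, so the peptide can be captured on a Ni-chelating column, and the TEV site sits last so that the second cut leaves no linker residues on the released peptide.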
The Metch gene was also first synthesized by overlap extension PCR with primers 5 to 7, then sequentially amplified with primers 8 and 7, primers 9 and 7, and primers 10 and 7. The resulting amplicon encoded an NHT-Metch fusion peptide.
Similarly, the Andropin fusion gene was first synthesized by overlap extension PCR with primers 11 to 15, then sequentially amplified with primers 8 and 15, primers 9 and 15, and primers 10 and 15. The resulting Andropin fusion gene encoded the NHT-Andropin fusion peptide. These fusion AMPs are outlined in Fig. S1 in the supplemental material. The PagP gene (Gene ID:  Table S1. PagP is a Gram-negative bacterial outer membrane protein and is extremely prone to accumulating in IBs when overexpressed in E. coli (1). Based on analysis of the PagP sequence using the web platform AGGRESCAN at http://bioinf.uab.es/aggrescan/ (18), seven aggregation HSs were identified and denoted HS1 to HS7. The C-terminal region of PagP (residues 101 to 161) was denoted PagP-1, comprising four aggregation HSs (HS4 to HS7; Fig. 3 and 4).
All constructs were checked by DNA sequencing, and primer sequences are listed in Supplementary data Table S1.
Expression of fusion proteins. E. coli BL21(DE3) cells harboring expression vectors were cultured overnight at 37°C in 6 mL of LB medium. Cultures were then diluted 100-fold and cultured at 37°C until they reached mid-log phase (optical density at 600 nm [OD600] of ~0.6 to 0.8). Expression of fusion proteins was induced by adding isopropyl β-D-1-thiogalactopyranoside (IPTG) to a final concentration of 0.3 mM, and cultures were further incubated for 12 h at 37°C. A 100-mL sample of culture was centrifuged for 10 min at 6,000 rpm and 4°C to harvest bacteria. The E. coli cells were resuspended in 10 mL lysis buffer (50 mM NaH2PO4-Na2HPO4, 0.2 M NaCl, 20 mM imidazole, pH 8.0), then lysed by sonication on ice. IBs were isolated by centrifugation for 15 min at 12,000 rpm at 4°C, washed twice with washing buffer I (20 mM Tris-HCl, 50 mM NaCl, 0.1% Triton X-100, 5 mM EDTA, pH 8.0), then with washing buffer II (20 mM Tris-HCl, 50 mM NaCl, pH 8.0). Extracted IBs and whole-cell lysates were subjected to analysis by SDS-PAGE, and gel images were further analyzed using Image Lab software (Bio-Rad) to evaluate expression levels of fusion proteins.
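The dilution and induction steps above reduce to simple volume arithmetic. The helper functions below are a minimal sketch of that arithmetic. The 100-fold dilution and the 0.3 mM final IPTG concentration are taken from the protocol, whereas the 100-mL culture volume and the 1 M IPTG stock used in the example are assumptions for illustration.

```python
def inoculum_volume_ml(final_volume_ml, fold_dilution):
    """Volume of overnight culture needed for a given fold dilution."""
    return final_volume_ml / fold_dilution


def stock_volume_ul(final_mM, culture_ml, stock_mM):
    """Stock volume in microliters from C1*V1 = C2*V2.

    The small volume added by the stock itself is ignored.
    """
    return final_mM * culture_ml / stock_mM * 1000.0


if __name__ == "__main__":
    culture_ml = 100.0   # assumed culture volume
    stock_mM = 1000.0    # assumed 1 M IPTG stock
    print("overnight culture (mL):", inoculum_volume_ml(culture_ml, 100))
    print("IPTG stock (uL):", round(stock_volume_ul(0.3, culture_ml, stock_mM), 1))
```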
Specific hydrolysis of fusion proteins by Ni(II) ions. Specific hydrolysis of fusion proteins by Ni(II) ions was conducted under denaturing conditions according to a previous report (15). Isolated IBs were dissolved in hydrolysis reaction buffer (20 mM HEPES, 6 M GuHCl, pH 8.2) at a concentration of ~200 μM. Hydrolysis was started by adding NiSO4 to a final concentration of 5 mM, and the mixture was incubated for 12, 24, or 36 h at 60°C to follow the course of hydrolysis. The mixture was then diluted 5-fold with lysis buffer (50 mM NaH2PO4-Na2HPO4, 0.3 M NaCl, 20 mM imidazole, pH 8.0) and centrifuged for 15 min at 12,000 rpm and 4°C to remove the precipitated fusion tag.
Purification of recombinant AMPs. After being specifically hydrolyzed by Ni2+ ions, fusion proteins were split into a tag and the NHT-AMP peptide, which was further purified by Ni-chelating affinity chromatography according to the protocol specified by the manufacturer (GE Healthcare Bio-Sciences). First, the column was equilibrated with lysis buffer (50 mM NaH2PO4-Na2HPO4, 0.3 M NaCl, 20 mM imidazole, pH 8.0), and the protein sample was loaded. The loaded column was washed three times with washing buffer III (50 mM NaH2PO4-Na2HPO4, 0.3 M NaCl, 40 mM imidazole, pH 8.0), and the recombinant NHT fusion (NHT-Metch, NHT-Mag II, or NHT-Andropin) was eluted with elution buffer I (50 mM Na2HPO4-NaH2PO4, 250 mM imidazole, pH 7.0). The fusion AMP (NHT-Metch, NHT-Mag II, or NHT-Andropin) in elution buffer I was directly subjected to specific cleavage by addition of TEV protease for 12 h at 25°C according to a previous report (40). The released AMP (Metch, Mag II, or Andropin) was further purified by ion-exchange chromatography using Macro-Prep CM Resin (Bio-Rad). The column was first equilibrated with two column volumes of equilibration buffer (200 mM imidazole, 50 mM NaH2PO4-Na2HPO4, pH 7.0). A 5-mL sample of protein was loaded onto the column, and recombinant AMPs were eluted with elution buffer II (500 mM NaCl, 50 mM Na2HPO4-NaH2PO4, pH 8.0).
Microbicidal activity assay of recombinant AMPs. The antimicrobial activities of recombinant AMPs were analyzed by inhibition zone assay according to the protocol in a previous report (42). Briefly, Gram-positive Staphylococcus aureus ATCC 25923 and Gram-negative E. coli strain K12 D31 were grown overnight at 37°C in LB medium. A 50-μL sample of culture was inoculated into 50 mL of fresh LB medium and incubated for an additional 2 to 3 h at 37°C to an OD600 of ~0.5. A 200-μL sample of cell suspension was inoculated into 50 mL of prewarmed (45°C) LB medium containing 0.8% (wt/vol) agar and rapidly dispersed. The medium was then poured into a petri dish (9 cm diameter) to form a uniform layer with a depth of ~1.5 mm. Holes with a diameter of 2 mm were punched into the solidified medium. For the microbicidal activity assay, recombinant AMPs were added into the punched holes, and the plate was incubated for 12 h at 37°C to assess the appearance of inhibition zones.
Data availability. The original contributions presented in this study are included in the article/ Supplemental Material. Further inquiries can be directed to the corresponding author.