A systematic investigation of the stability of green fluorescent protein fusion proteins*

X-ray crystallography provides important insights into structure-function relationships in biomolecules. However, protein crystals are usually hard to obtain, which hinders our understanding of many important processes. Crystallization requires large amounts of protein sample, whereas recombinant proteins are often unstable or insoluble. Green fluorescent protein (GFP) fusion is one of the approaches used to increase protein synthesis, solubility and stability, thereby facilitating crystallization. In this study we analyze the influence of linker length and composition, and of the position of GFP relative to the fusion partner, on fusion protein production and stability. To this end, multiple constructs of an enzymatically impaired variant of the PemK Sa toxin from Staphylococcus aureus CH91 fused to GFP were generated. Fusion protein production in Escherichia coli was evaluated. The proteins were purified and their stability was tested. The PemK Sa-α14aa-GFP fusion provided the best production and stability. The results demonstrate the importance of optimizing the fusion protein construct, including linker selection and the order of fusion partners, for obtaining high quantities of stable protein for crystallization.


INTRODUCTION
Protein crystallization followed by X-ray diffraction studies allows the elucidation of three-dimensional structures, providing vital information on structure-activity relationships. Understanding the molecular details of the mechanism of action of different proteins drives progress in basic research and in many other fields, including drug discovery (Overington et al., 2006). However, crystallization requires a considerable amount of purified, well-behaved sample, which is often hard to obtain. Fusion proteins are an attractive tool that frequently allows this limitation to be overcome (Smyth et al., 2003). Fusion proteins not only facilitate purification, but are also used to improve the production, solubility and stability of target proteins. Protein tags are attached at either the N- or the C-terminus of the fusion partner or, less often, replace intrinsically unstructured regions (Cherezov et al., 2007). The importance of fusion proteins in protein crystallization is not limited to facilitating protein synthesis. By increasing the available polar surface, fusion proteins may also facilitate crystal growth (Cherezov et al., 2007).
Commonly used protein tags include maltose-binding protein (MBP) (Kobe et al., 1999), glutathione-S-transferase (GST) (Lally et al., 1998), thioredoxin A (Corsini et al., 2008), antibody fragments (Iwata et al., 1995) and lysozyme (Cherezov et al., 2007), but other tags have also been used. These tags have helped to obtain a large number of crystal structures. Yet another possibility is to use green fluorescent protein (GFP) as a fusion tag. GFP originates from the jellyfish Aequorea victoria (Cubitt et al., 1995). Owing to its well-defined, compact structure (Yang et al., 1996) and its stability under varied conditions, including a broad pH range, elevated temperature and the presence of detergents (Cubitt et al., 1995), it constitutes an ideal fusion partner. It is also highly soluble. Although GFP has a relatively large molecular weight (~27 kDa), it usually does not interfere with the function of fused proteins (Kwolik et al., 2001; Hsieh et al., 2010). These characteristics underlie the success of GFP in tracking proteins in vivo (Hsieh et al., 2010; Ashikawa et al., 2011) and its use as a scaffold tag in protein crystallization (Suzuki et al., 2010; Mueller et al., 2013; Nguyen et al., 2013). The position of the tag relative to the fusion partner and the properties of the peptide linker are of considerable importance for successful production and stability (Arai et al., 2001; Japrung et al., 2005). Nevertheless, the available experimental data are insufficient and partly contradictory, and do not support rational construct design. Some reports suggest that placing the tag at the N-terminus of the fusion partner ensures high production and proper folding (di Guan et al., 1988). Other studies argue that fusion at the C-terminus of the target protein improves solubility (Japrung et al., 2005).
Some data indicate that a flexible linker preserves the functionality of the fusion partner (Robinson et al., 1998), but another study demonstrated that only a rigid, α-helical linker provides enough spatial separation between the fusion partners to allow independent folding (Arai et al., 2001). Linker length is also important, but insufficient data are available to draw conclusions on the optimal length and composition (Kwolik et al., 2001; Corsini et al., 2008).
Clearly, the controversy concerning construct design is partly related to the fact that different fusion partners may require different strategies to achieve efficient production. Nevertheless, we believe that a systematic investigation may yield certain general recommendations for rational construct design. Therefore, in this study we designed a set of fusion proteins containing GFP and an enzymatically impaired staphylococcal mRNA interferase, PemK Sa (R84A). The interferase, which belongs to the toxin-antitoxin system encoded on the pCH91 plasmid of Staphylococcus aureus CH91 (Takeuchi et al., 1999; Bukowski et al., 2013), was chosen because of the problematic production and stability of its His-tagged constructs (unpublished results). Synthesis level, solubility and stability were evaluated for N- and C-terminal GFP fusions containing linkers of different length and type. The results allow certain general recommendations to be formulated for the design of GFP fusion constructs.

MATERIALS AND METHODS
Bacterial strains and culture conditions. E. coli TOP10 (Invitrogen) and BL21(DE3) (Novagen) were used for cloning and fusion protein production, respectively.
Protein production and purification. Liquid overnight cultures were diluted 100-fold in 500 ml of Tryptic Soy Broth (Sigma Aldrich) containing ampicillin (0.1 mg/ml). The bacteria were cultured at 37°C with thorough aeration until the optical density measured at 600 nm reached 0.6. Protein production was induced with IPTG (isopropyl β-D-1-thiogalactopyranoside) at a final concentration of 1 mM. After induction, the cultures were incubated overnight at 20°C with thorough aeration. Cells were harvested by centrifugation for 30 min at 5 000×g and lysed by sonication. Lysates were clarified by centrifugation for 30 min at 21 000×g. The fusion proteins were purified by affinity chromatography under native conditions using the buffers recommended by the resin supplier (nickel-nitrilotriacetic acid, NiNTA; Novagen). The preparation was dialyzed overnight against 5 mM Tris/HCl, 50 mM NaCl, pH 7.4, precipitation was assessed, and the fusion protein was further purified by gel filtration on a Superdex 75 column (GE Healthcare) in the same buffer.
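The quantities implied by this protocol follow from simple dilution arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the 1 M IPTG stock concentration is an assumption that is not stated in the text.

```python
def inoculum_volume_ml(medium_volume_ml: float, fold_dilution: float) -> float:
    """Approximate volume of overnight culture for an N-fold dilution."""
    return medium_volume_ml / fold_dilution


def additive_mass_mg(final_conc_mg_per_ml: float, volume_ml: float) -> float:
    """Mass of a solid additive (e.g. ampicillin) needed for a given volume."""
    return final_conc_mg_per_ml * volume_ml


def stock_volume_ml(final_conc_mM: float, volume_ml: float,
                    stock_conc_mM: float) -> float:
    """Volume of stock solution from C1 * V1 = C2 * V2."""
    return final_conc_mM * volume_ml / stock_conc_mM


if __name__ == "__main__":
    culture_ml = 500.0  # culture volume given in the protocol
    print("overnight culture (ml):", inoculum_volume_ml(culture_ml, 100))
    print("ampicillin (mg):", additive_mass_mg(0.1, culture_ml))
    # Assumption: a 1 M (1000 mM) IPTG stock; the source gives only the final 1 mM.
    print("IPTG stock (ml):", stock_volume_ml(1.0, culture_ml, 1000.0))
```

With these inputs, roughly 5 ml of overnight culture, 50 mg of ampicillin and, under the assumed stock concentration, 0.5 ml of IPTG stock would be needed per 500 ml culture.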
Determination of stability of fusion proteins. The samples were incubated at 4°C and 22°C for two weeks. The integrity and the amount of remaining fusion protein were analyzed by SDS-PAGE and densitometry. Intensity profiles were generated and quantified using ImageJ software (Schneider et al., 2012).
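One plausible way to turn the densitometry readout into a stability measure is to normalize each band intensity to the time-zero band. The sketch below assumes this normalization; the function name and the intensity values are hypothetical and are not data from this study.

```python
def percent_remaining(band_intensities):
    """Express each band intensity as a percentage of the time-zero band.

    band_intensities: integrated intensities of the fusion protein band,
    ordered by time, with the time-zero sample first.
    """
    if not band_intensities:
        raise ValueError("at least the time-zero intensity is required")
    baseline = band_intensities[0]
    if baseline <= 0:
        raise ValueError("time-zero intensity must be positive")
    return [100.0 * value / baseline for value in band_intensities]


if __name__ == "__main__":
    # Hypothetical intensities (arbitrary units) at day 0, 7 and 14.
    example = [2000.0, 1500.0, 500.0]
    for day, pct in zip((0, 7, 14), percent_remaining(example)):
        print(f"day {day}: {pct:.0f}% of the fusion protein band remaining")
```

Normalizing within each lane series makes constructs comparable even when different absolute amounts were loaded, provided that the band intensities fall within the linear range of the stain.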

Evaluation of fusion protein production and solubility
We have previously described the toxin-antitoxin system PemIK Sa of S. aureus strain CH91 (Bukowski et al., 2013); however, our attempts to obtain recombinant PemK Sa or its mutants at crystallization quality have failed due to instability of the toxin, especially in low ionic strength buffers. In this study we set out to obtain a stable GFP fusion of PemK Sa (R84A) (hereafter denoted PemK Sa) for crystallization and, in parallel, to systematically evaluate the effect of GFP position relative to the fusion partner and of the linker on production and stability. Six constructs were prepared, three containing GFP at the N-terminus of PemK Sa and three at the C-terminus. In both types of constructs three linkers were evaluated: a short two-amino-acid linker, a flexible 15-amino-acid linker and a rigid α-helical 14-amino-acid linker (Fig. 1). We analyzed the synthesis level and solubility (soluble fraction vs. inclusion bodies) of each construct in E. coli. The efficiency of recombinant protein production and the amount of protein in the soluble fraction differed between the evaluated constructs (Fig. 2 and Table 3). The highest production was obtained for PemK Sa-α14aa-GFP, whereas the production level of GFP-2aa-PemK Sa was the lowest among all tested constructs. The influence of linker type on synthesis level differed between constructs containing GFP at the C-terminus and those containing GFP at the N-terminus. In the former group the highest production was obtained with the α-helical linker, while in the latter it was obtained with the flexible linker. Concerning solubility, no pronounced differences were observed between C-terminal fusions and those with GFP attached at the N-terminus. This does not agree with previously published data suggesting that N-terminal fusion tags give better synthesis and solubility (di Guan et al., 1988).
We observed no effect of linker type on the solubility of C-terminal GFP fusions, whereas among the N-terminal fusion constructs the α-helical linker increased the amount of fusion protein found in the soluble fraction.
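The construct set described above is a full 2 × 3 design: two orientations of GFP relative to PemK Sa combined with three linkers. As a minimal sketch, the six constructs can be enumerated as follows. The naming pattern is extrapolated from the construct names used in the text (for example GFP-2aa-PemK Sa), and the identifiers in the code are illustrative only.

```python
from itertools import product

# Linker labels as used in the text: short, flexible and rigid alpha-helical.
LINKERS = ["2aa", "15aa", "α14aa"]
TARGET = "PemKSa"  # enzymatically impaired PemK Sa (R84A)
TAG = "GFP"


def construct_names():
    """List all orientation x linker combinations, N-terminal partner first."""
    names = []
    for gfp_at_n_terminus, linker in product([True, False], LINKERS):
        first, second = (TAG, TARGET) if gfp_at_n_terminus else (TARGET, TAG)
        names.append(f"{first}-{linker}-{second}")
    return names


if __name__ == "__main__":
    for name in construct_names():
        # Every construct additionally carries an N-terminal six-histidine tag.
        print(f"His6-{name}")
```

Enumerating the design explicitly makes it easy to check that every orientation is paired with every linker, which is what allows the effects of position and linker to be compared separately.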

Assessment of fusion protein stability
To compare the stability and other properties of the constructed fusion proteins, each was purified using immobilized metal-affinity chromatography (IMAC) and gel filtration. First, we determined how the relative order of GFP and PemK Sa affects the efficiency of purification. For all constructs, an equal volume of the NiNTA resin was overloaded with the fusion protein. Overloading was evidenced by the presence of the fusion protein in the flow-through. After thorough washing, the proteins were eluted and the amount of fusion protein was quantified. In all tested cases a higher amount of fusion protein was obtained for constructs containing GFP at the C-terminus (Fig. 3 and Table 3), indicating more efficient binding to the resin compared to constructs containing GFP at the N-terminus. We speculate that this effect may be related to steric constraints. In the configuration where the tag is linked to the smaller partner (PemK Sa; ~12 kDa), more protein can be packed close to the resin than when the larger partner (GFP; ~27 kDa) is directly linked to the tag. This was, however, not tested further.
Samples obtained after IMAC were dialyzed against low ionic strength buffer, the one most suitable for further crystallization. It is important to note that in our previous experience His-tagged PemK Sa was unstable in low ionic strength buffers, resulting in almost complete protein precipitation. Recovery differed markedly between the analyzed constructs (Fig. 3 and Table 3). PemK Sa-15aa-GFP was the most resistant to low ionic strength, and only 20% of the initial sample was lost during dialysis due to precipitation. PemK Sa-2aa-GFP was the least stable under these conditions, and over 90% of the initial sample was lost due to precipitation. The tested constructs containing GFP at the N-terminus of the fusion partner were relatively stable regardless of the linker type. This demonstrates that the stability of the fusion protein under the tested conditions is determined by a synergistic effect of the relative position of the fusion partners and linker design.
Figure caption: PemK Sa toxin and GFP were combined in two different relative orientations. In each orientation the fusion partners were connected with three linker types (six different constructs were evaluated in total). A six-histidine tag was added at the N-terminus of each fusion protein.
Following dialysis, the multimeric state of the obtained proteins was evaluated by gel filtration. Since gel filtration is routinely used as a polishing step in protein preparation, we also investigated the recovery of particular constructs. Equal amounts of each fusion protein were separated on Superdex 75 equilibrated with low ionic strength buffer identical to that used for dialysis. The elution profiles differed significantly between N- and C-terminal fusions (Fig. 4). All fusion proteins containing GFP at their C-terminus eluted as a single peak corresponding to a fusion protein dimer. Most probably dimerization occurred through the GFP component, since PemK Sa was previously shown to be monomeric under similar conditions (Bukowski et al., 2013). Indeed, the crystal structure of GFP clearly demonstrates dimerization (Yang et al., 1996). All the samples containing GFP linked at the N-terminus of PemK Sa presented a more heterogeneous gel filtration profile containing more than one protein peak. The two major peaks correspond to the protein dimer and monomer, but smaller peaks are additionally present; these also contain the analyzed fusion protein, but their physicochemical character remains unknown. Clearly, dimerization is hindered in the N-terminal fusions, most probably due to steric obstruction by the fusion partner linked to the C-terminus of GFP, the part that mediates dimerization (Yang et al., 1996). This explanation is corroborated by the fact that the fusion protein containing the long flexible linker shows a much higher fraction of the dimer than the two other evaluated fusions. Interestingly, separation of GFP and PemK Sa with the flexible linker resulted in the best recovery regardless of the relative order of the fusion partners.
Protein crystallization usually involves prolonged incubation either at room temperature or at 4°C. To test the stability of the obtained fusion proteins, each was incubated in a buffer containing 5 mM Tris/HCl and 50 mM NaCl for two weeks. All tested variants of the fusion protein were stable for at least two weeks at 4°C. At room temperature (~22°C), however, significant degradation of all fusion proteins except PemK Sa-α14aa-GFP was observed. Linker-dependent differences in the degradation pattern were noted. Proteins containing the short or the α-helical linker degraded slowly and without any stable intermediates, indicating multiple sites of hydrolysis. In contrast, the proteins containing the flexible linker degraded much faster, but hydrolysis was observed only within the linker and PemK Sa, while GFP remained stable for the duration of the experiment (Fig. 5).
In conclusion, our study demonstrates the influence of the relative position of GFP and the fusion partner, as well as of linker length and design, on production yield, soluble fraction content, efficiency of affinity purification and stability, both during dialysis and during prolonged incubation. Among the examined proteins, PemK Sa-α14aa-GFP exhibited all the properties desirable for a crystallization candidate. It was efficiently produced, well behaved during purification and stable over time. More generally, we showed that C-terminal fusions are advantageous during gel filtration owing to their homogeneous elution profile. We have moreover shown that short and helical linkers are more resistant to hydrolysis. These findings are important for the design of GFP-containing fusion proteins.
Figure caption: The fusion proteins were incubated in low ionic strength buffer for two weeks at 4°C and 22°C. The first lane in each panel represents the sample at time zero. The second and third lanes represent the samples after seven and fourteen days of incubation, respectively.