A Novel Recombinant DNA System for High Efficiency Affinity Purification of Proteins in Saccharomyces cerevisiae

Isolation of endogenous proteins from Saccharomyces cerevisiae has been facilitated by inserting sequences encoding polypeptide affinity tags at the C-termini of chromosomal open reading frames (ORFs) using homologous recombination of DNA fragments. Tagged protein isolation is limited by a number of factors, including the high cost of affinity resins for bulk isolation and the low concentration of ligands on the resin surface, leading to low isolation efficiencies and trapping of contaminants. To address this, we have created a recombinant “CelTag” DNA construct from which PCR fragments can be generated to easily tag the C-termini of S. cerevisiae ORFs using selection for a nat1 marker. The tag has a C-terminal cellulose binding module to be used in the first affinity step. Microgranular cellulose is very inexpensive and has an effectively continuous ligand on its surface, allowing rapid, highly efficient purification with minimal background. Cellulose-bound proteins are released by specific cleavage of an included site for TEV protease, giving nearly pure product. The tag can be lifted from the recombinant DNA construct either with or without a 13x myc epitope tag between the target ORF and the TEV protease site. Binding of CelTag protein fusions to cellulose is stable to high salt, nonionic detergents, and 1 M urea, allowing stringent washing conditions to remove loosely associated components, as needed, before specific elution. It is anticipated that this reagent could allow isolation of protein complexes from large quantities of yeast extract, including soluble, membrane-bound, or nucleic acid-associated assemblies.


INTRODUCTION
Proteomics requires quick, specific, and dependable methods for purifying proteins and complexes. Typically, techniques for characterizing protein complexes require large quantities of highly purified protein, which is difficult to obtain for rare or unstable complexes. The most common method currently available for specific, high-purity protein purification in yeast is the dual affinity tag known as the TAP (tandem affinity purification) tag (Rigaut et al. 1999). This tag uses the IgG binding domain of protein A of Staphylococcus aureus (ProtA) (Uhlen et al. 1983; Nilsson et al. 1987) and a calmodulin-binding peptide separated by a Tobacco Etch Virus (TEV) protease cleavage site (Dougherty et al. 1989; Kapust et al. 2001). The tag is added to the C-terminal end of a protein, and the fusion is bound to IgG immobilized on a bead matrix during the first step of purification. The bound protein is eluted from the matrix by TEV protease cleavage under mild conditions and is then bound to calmodulin-coated beads in the presence of calcium during the second purification step. The bound proteins are then eluted from the calmodulin beads with EGTA (Lohman et al. 1989; Puig et al. 2001; Van Driessche et al. 2005).
Tandem affinity methods are very powerful, yet have limitations. Native protein complexes are often rare and unstable, so without rapid, high-yield purification methods these complexes might not be isolated cleanly using standard purification techniques. In vivo overexpression of rare protein complexes is often impractical because of artificial interactions and low concentrations of physiologically relevant partners. Such complexes, whether soluble, bound to specific genes, or associated with membranes, can exist in less than one copy per cell, making isolation on a large scale mandatory. The problems with current approaches are that affinity purification reagents are expensive on a large scale, and the low concentration of affinity ligand per surface area/volume on affinity resins makes specific binding inefficient, leading to low signal-to-noise ratios. We sought to create a new recombinant DNA system for adding affinity tags to proteins that provides enough specificity and enrichment in the purification at relatively low cost.
The result is an affinity tag that provides enough specificity and enrichment in the purification while maintaining a relatively low cost, especially during the first, bulk isolation step. To do this we created a C-terminal dual tag (termed CelTag) consisting of a family 3 cellulose binding module (CBM3) (Levy and Shoseyov 2002) and a 13x c-Myc repeat epitope tag (Terpe 2003) separated by a TEV protease cleavage site (Dougherty et al. 1989) (Figure 1). CBM3 was chosen for its high affinity and specificity for a cellulose matrix and because it has been shown to form an independent domain at either terminus of a protein (Stofko-Hahn et al. 1992; Levy and Shoseyov 2002; Hong et al. 2007). Cellulose is an excellent matrix for purification because of its very low cost, its low background (few proteins in yeast and animals have nonspecific affinity for it), and its stability across a range of buffer conditions and pH levels. Recently it was shown that CBM3 could be used as a single affinity tag to purify overexpressed proteins from recombinant DNA in Pichia pastoris (Wang and Hong 2014). A 13x c-Myc repeat epitope was added in our constructs because strong monoclonal antibodies against this sequence are available at relatively low cost. This could be particularly useful for lesser-studied proteins for which only poor antibodies, or none, exist.
A major advantage of yeast for this type of study is that the transformed DNA sequences recombine into the yeast chromosomes by sequence homology (Figure 1) at a relatively high frequency. This makes it possible to easily tag the C-termini of proteins and analyze them in vivo in the proper physiological context. To determine the effectiveness and efficiency of integration, a marker is engineered into the tag, which allows the yeast transformants to be isolated by growth on selective media. Here we provide a proof of concept example of isolation of a soluble yeast protein.

MATERIALS AND METHODS
All reagents and chemicals were reagent grade and were purchased from Sigma-Aldrich (St. Louis, MO, USA) and Fisher Scientific (Pittsburgh, PA, USA) unless otherwise noted.
Strains: A DNA fragment containing the entire tag and selection marker was PCR-amplified from the pRS426 vector and ligated into the pGEM-T Easy vector according to the manufacturer's specifications (https://www.promega.com/resources/protocols/technicalmanuals/0/pgem-t-and-pgem-t-easy-vector-systems-protocol/). The correct sequence of the tag was confirmed by Sanger sequencing (University of Michigan Sequencing Core). The plasmid will be available through Addgene (Plasmid #66562), and the sequence and full map are provided in Supplementary Figure 1. Figure S2 and Table S1 for sequences). The tag was amplified from the plasmid with ends homologous to the ORF of interest and was purified by electrophoretic DNA purification and extraction as described above. To ensure there was enough fragment for efficient transformation into yeast, a total of 400 µL of PCR was done (4 x 100 µL reactions).
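As an illustration of how homologous ends can be added to the tag during amplification, the following Python sketch assembles a primer pair for C-terminal tagging. It is a minimal sketch under stated assumptions, not the primer design used in this work: the 40-nucleotide arm length, the function names, and the annealing sequences passed in are hypothetical, and the actual primer sequences are those referred to in the supplementary material.

```python
# Sketch: build primers that add ORF-flanking homology to a tag cassette.
# Arm length and all example sequences are assumptions for illustration only.

COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tagging_primers(orf, downstream, tag_fwd_anneal, tag_rev_anneal, arm=40):
    """Return (forward, reverse) primers for C-terminal tagging of an ORF.

    orf            : coding sequence including its stop codon (5'->3')
    downstream     : genomic sequence immediately 3' of the stop codon
    tag_fwd_anneal : top-strand sequence at the start of the tag cassette;
                     it must continue the ORF reading frame
    tag_rev_anneal : primer-strand sequence annealing to the cassette end
    arm            : homology arm length in nucleotides (assumed value)
    """
    orf = orf.upper()
    if len(orf) % 3 != 0 or orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must be in frame and end with a stop codon")
    coding = orf[:-3]  # drop the stop codon so the tag is translated in frame
    if len(coding) < arm or len(downstream) < arm:
        raise ValueError("not enough flanking sequence for the homology arms")
    # Forward primer: last `arm` coding nucleotides, then cassette start.
    forward = coding[-arm:] + tag_fwd_anneal.upper()
    # Reverse primer: reverse complement of the first `arm` nucleotides
    # after the stop codon, then the cassette-end annealing sequence.
    reverse = reverse_complement(downstream[:arm]) + tag_rev_anneal.upper()
    return forward, reverse
```

The homology arms direct recombination so that the cassette replaces the stop codon of the target ORF; the selectable marker carried on the cassette then allows transformants to be recovered.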

Yeast Cell Lysate Preparation
Control or transformed yeast cells in 6 L of YPD medium were grown to early log phase, an converter, ½" tapped bio horn, and 1/8" tapered microtip as the probe assembly) on ice for a total of 1.5 minutes (3 seconds on, 10 seconds off) at 25% amplitude. Debris was spun out at 4000 RPM for 5 minutes at 4 °C and lysate was then aliquoted for binding experiments and stored at -

Preparation of Regenerated Amorphous Cellulose (RAC)
RAC was prepared by acid treatment as previously described (Wang and Hong 2014). The resulting material is rather gelatinous and can be difficult to work with. After the acid treatment, the material was resuspended once in CBM3 binding buffer with 2 M salt, once in CBM3 binding buffer with no salt, and 3 times in CBM3 binding buffer. The RAC was then resuspended and stored as a 1:1 slurry of cellulose and binding buffer.

Cellulose was washed 3 times in a 20-volume excess of 2 M NaCl CBM3 binding buffer, 2 times in no-salt CBM3 binding buffer, and 3 times in CBM3 binding buffer. Cellulose was then resuspended as a 1:1 slurry in CBM3 binding buffer.

Cellulose Pulldowns
For each pulldown reaction, 450 µL of cell lysate, prepared as above, was used to resuspend variable volumes of packed cellulose pellet and allowed to bind at 4 °C for 20 minutes with gentle agitation. Cellulose pellets were washed four times in an excess volume (50-fold) of binding buffer before elution in 50 µL of SDS-PAGE Loading Buffer (1X working: 80 mM Tris-HCl pH 6.8, 10% glycerol, 2% SDS, 100 mM DTT, 0.2% bromophenol blue, 100 mM 2-mercaptoethanol), a binding buffer with varied additives, or a TEV protease elution.

TEV Protease Elution
TEV protease was purified using a previously described plasmid encoding a His6 tagged TEV clone and published protocol (Tropea et al. 2009 according to manufacturer's specifications (http://www.bio-rad.com/en-us/product/coomassiestains).

To test whether a protein fused to the CelTag peptide sequences at its C-terminus could efficiently bind to cellulose, we chromosomally tagged the ORF for Pgk1, a soluble cytoplasmic protein, as a proof of concept. The CelTag DNA sequences were integrated at the 3' end of the chromosomal PGK1 ORF by homologous recombination (Figure 1A), allowing purification of the tagged protein using cellulose as the affinity matrix and elution from the matrix with TEV protease or SDS (Figure 1B). Previously it was shown that commercially available microgranular cellulose powder has suboptimal accessible surface area and binding capacity compared to regenerated amorphous cellulose (RAC) made by phosphoric acid treatment of the microgranular material (Hong et al. 2007; Hong et al. 2008; Wang and Hong 2014). Because this treatment causes the cellulose to form more gel-like pellets and is somewhat burdensome for regular use, we tested whether RAC pulled down Pgk1-CelTag more efficiently than microgranular cellulose to a degree that would justify its use. Pgk1-CelTag was pulled down from 450 µL of extract with varying amounts of both regenerated amorphous cellulose and untreated microgranular cellulose. Once binding was complete, the cellulose pellets were eluted in 50 µL of SDS-PAGE Loading Buffer and run on a protein gel (Figure 2A). The results confirm that RAC is more efficient at pulling down the tagged protein, but at the highest amounts used microgranular cellulose recovers at least as much tagged protein as RAC (76 percent from RAC versus 81 percent from microgranular cellulose, Figure 2A). Because RAC is more difficult to use, this obviates the need for it, and the subsequent experiments were done using microgranular cellulose. The most tagged protein was recovered from 450 µL of cell extract with packed pellets of 25 µL or 50 µL of microgranular cellulose.
Thus, subsequent experiments used approximately 1/20 volume of packed cellulose relative to extract volume. Tagged protein was not detected in the unbound fractions (data not shown), suggesting that some protein might be unrecoverable from the cellulose matrix.
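For scaling the pulldown to larger extract volumes, the ratios stated above can be written as a short calculation. The Python sketch below is only illustrative: the function name is hypothetical, and it assumes that the 50-fold wash excess is measured relative to the packed cellulose volume, which the text does not state explicitly.

```python
def pulldown_volumes(extract_ul, cellulose_divisor=20, wash_fold=50, n_washes=4):
    """Scale cellulose and wash volumes to a given extract volume (in µL).

    cellulose_divisor : packed cellulose is extract volume / this value
                        (approximately 1/20 volume, as used in the text)
    wash_fold         : wash volume as a multiple of the packed cellulose
                        volume (assumption: excess is relative to the pellet)
    n_washes          : number of washes before elution
    """
    if extract_ul <= 0:
        raise ValueError("extract volume must be positive")
    packed = extract_ul / cellulose_divisor
    wash_each = packed * wash_fold
    return {
        "packed_cellulose_ul": packed,
        "wash_volume_ul": wash_each,
        "total_wash_ul": wash_each * n_washes,
    }


if __name__ == "__main__":
    # The 450 µL binding reaction described in the text.
    for name, value in pulldown_volumes(450).items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions, a 450 µL reaction uses 22.5 µL of packed cellulose, close to the 25 µL pellet that performed well in Figure 2A.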

Pgk1-CelTag Bound to Cellulose is Resistant to Stringent Wash Conditions
Next where about 35% is eluted (Figure 2B). Moderate acidic and basic conditions (pH 3 or 10) also eluted only small amounts of protein (Figure 2B), and binding is almost completely resistant to concentrations of the nonionic detergents NP-40 and Triton X-100 above their critical micelle concentrations, with only 2% of the tagged protein eluted in each case (Figure 2B). Lastly, we investigated whether cellobiose, the cellulose sugar dimer, is able to competitively elute the protein from the matrix. At concentrations just below saturation, cellobiose is only able to elute 5% of the total protein off the cellulose over the time course used (Figure 2B). These data suggest that stringent washing prior to elution could be used to remove the vast majority of molecules that might bind non-specifically to the cellulose matrix or the tagged protein.

TEV Protease Efficiently and Specifically Elutes Pgk1-CelTag from Cellulose
We next determined the approximate amount of TEV protease needed for optimal release of tagged protein from the cellulose. At the highest concentration of TEV protease, 73% of the tagged protein relative to input is eluted from the cellulose (indicated by a smaller band representing Pgk1 with the cellulose binding module removed) (Figure 3A, lanes 1 and 3).
The TEV elution is also very specific and eliminates most of the residual non-specific contamination seen if the pellet is eluted in SDS (Figure 3B, lanes 2 and 3). To remove the recombinant His6-tagged TEV protease from the elution, we passed the eluted sample through nickel-coated beads (Figure 3B, lanes 3 and 4). The total yield and fold-purification obtained with this method for a CelTagged protein are summarized in Figure 3C. In our hands the CelTag purification scheme provides ~470-fold purification of the protein of interest and, more importantly, recovers about 73% of the total starting protein, using only a single affinity step followed by removal of the TEV protease.
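For readers who wish to compute the same summary statistics for their own purifications, the Python sketch below shows the standard definitions of yield and fold-purification. It is a minimal sketch, not the calculation behind Figure 3C: the class and function names are hypothetical, and the example quantities are invented round numbers that do not correspond to any measurement in this work.

```python
from dataclasses import dataclass


@dataclass
class ProteinFraction:
    """Protein content of one purification fraction (hypothetical container)."""
    total_protein_ug: float  # all protein in the fraction
    target_ug: float         # tagged protein of interest in the fraction

    @property
    def purity(self) -> float:
        """Target protein as a fraction of total protein."""
        return self.target_ug / self.total_protein_ug


def percent_yield(start: ProteinFraction, final: ProteinFraction) -> float:
    """Percent of the starting target protein recovered in the final fraction."""
    return 100.0 * final.target_ug / start.target_ug


def fold_purification(start: ProteinFraction, final: ProteinFraction) -> float:
    """Enrichment of the target: final purity divided by starting purity."""
    return final.purity / start.purity


if __name__ == "__main__":
    # Invented example values, for illustration only.
    extract = ProteinFraction(total_protein_ug=1000.0, target_ug=1.0)
    eluate = ProteinFraction(total_protein_ug=2.0, target_ug=0.5)
    print(f"yield: {percent_yield(extract, eluate):.0f}%")
    print(f"fold purification: {fold_purification(extract, eluate):.0f}")
```

Yield and fold-purification are reported together because they measure different things: a scheme can enrich the target strongly while losing most of it, so both values are needed to judge a purification.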

DISCUSSION
We have shown as a proof of principle that the CelTag affinity tag is able to efficiently recover a majority of total endogenous, tagged protein from crude cellular extracts with nearly complete purity. In contrast to previous uses of the cellulose tags, it appears that using commercially available microgranular cellulose is as effective as using the more labor-intensive RAC at binding the CBM3 domain as an affinity tag. This is useful since microgranular cellulose is much easier to transfer and does not require acid treatment. The recombinant DNA construct encoding the CelTag has been constructed to also contain a 13x c-Myc epitope tag, so that the fragment to be used for tagging a chromosomal ORF can be lifted by PCR to either contain or delete the c-Myc portion as an additional isolation or detection tool.
Another useful aspect of the CelTag is that relatively stringent washing conditions are possible once the tagged proteins are bound to the cellulose support. Tests with high salt, non-ionic detergents, and urea showed that reasonably severe, non-denaturing conditions can be used to remove non-specific or loosely associated macromolecules from the isolated target. Cellobiose elutions were attempted to see whether the tag could be competed off the cellulose without the use of TEV protease, but at soluble concentrations of cellobiose the tag could not be competed off within the tested time frame, likely because of the high affinity and specificity of the CelTag for cellulose. Therefore, TEV protease cleavage was optimized for efficient elution of CelTagged protein from cellulose.
We envision this protocol being used for the specific tagging of chromosomal open reading frames and for the purification of rare multipolypeptide, protein-RNA, and protein-DNA complexes.