Episomal Vectors for Rapid Expression and Purification of Proteins in Mammalian Cells

Research projects in life sciences aim at studying and better understanding biological systems. Over the past 50 years, tremendous advances in molecular biology and biochemistry have provided essential tools to dissect biological processes down to the molecular level. When studying the structure and function of proteins, it is often not feasible to obtain sufficient quantities of the native protein from the relevant cells or tissue. The development of recombinant DNA technologies to clone and express genes encoding proteins of interest has revolutionized the design and execution of research projects (Cohen et al., 1973). Indeed, access to purified recombinant proteins enables a wide spectrum of studies, ranging from structural characterization of protein-protein and protein-nucleic acid interactions to immunization programs to generate antibodies as research tools. The availability of sufficient quantities of purified recombinant proteins is often key to success. Furthermore, recombinant approaches for the production of proteins have profoundly impacted biomedical research and drug development, as they have opened the possibility of producing clinical grade proteins as drugs. This has paved the way for the emergence and fast development of protein biologics, which today represent a very successful and quickly expanding class of drugs (Saladin et al., 2009; Chiverton et al., 2010).

or modifications obtained via site-directed mutagenesis, are often needed in the course of a project and thus require the expression and purification of many protein variants in short periods of time. Therefore, the flexibility and speed of a particular system also have to be taken into consideration. Ideally, an expression system should combine high yield, ease of purification, high product quality and short timelines.
In this chapter, we describe the design and use of multicistronic episomal protein expression vectors combined with improved cell culture methods and single step affinity purification in order to meet these requirements. This approach is rapid (4-6 weeks) and can be used in any laboratory equipped for mammalian cell culture and standard protein purification for the production, in the milligram per liter range, of biologically active recombinant proteins from cell culture supernatants.

Approaches for recombinant protein expression in mammalian cells
When studying human or mammalian proteins, expression in mammalian cells not only provides the optimal machinery for proper folding and post-translational modifications, but also facilitates the expression of large and multimeric protein complexes. Several approaches for recombinant protein expression in mammalian cells can be envisaged that differ dramatically in overall yield, workload and timelines (Colosimo et al., 2000). Small-scale transient transfections offer a fast and flexible approach for producing microgram quantities of proteins in a short period of time (days). Different methods have been described for the delivery of plasmid DNA into cells to drive the transient expression of the gene of interest. Up-scaling this approach is feasible in order to produce larger amounts of proteins (milligrams to grams) in a short time. However, large-scale transient expression requires significant quantities of both exponentially growing cells and DNA, as well as specialized equipment, and is thus not easy to implement in all laboratories (Geisse et al., 2005; Backliwal et al., 2008). Using transient approaches, a new transfection has to be performed for each protein production batch.
The most commonly used strategy for large-scale protein expression (milligram to kilogram scale) is the establishment of stable cell lines, in which the expression plasmid incorporates into the host cell genome (Hacker et al., 2009). The plasmid also includes a marker that allows the selection and clonal amplification of cells that have stably integrated the expression plasmid (Costa et al., 2010). Once the genetic stability of the cell line has been established, it can be expanded, cryopreserved and used for multiple production runs, thus maximizing batch-to-batch consistency of the expressed protein. The main limitation is that stable cell line generation is time-consuming and laborious. It is therefore well suited for the production of proteins at industrial scale or for the production of therapeutic proteins, but not for covering the evolving needs of a research project.
Semi-stable expression offers a compromise between transient transfection and stable cell line generation. In this case, following transfection with a plasmid containing a selectable marker, pools of cells are expanded under selective pressure to obtain large volumes of cells expressing the protein of interest in a relatively short time (weeks). The main advantages and limitations of the methods described above are summarized in Table 1.

Episomal vectors
The generation and amplification of semi-stable cell pools is performed under selective pressure, for instance using an antibiotic resistance gene (Lufino et al., 2008; Wong et al., 2009). After transfection, cells that have integrated plasmid DNA into their genome in a location that enables expression of the selectable marker will grow and expand. Depending on the genome integration site, the expression level of the selection marker and the gene of interest can vary significantly. Episomal vectors present the advantage that they can replicate and propagate extrachromosomally in the transfected cells without the need for genomic integration. Episomal vectors contain sequences from DNA viruses, such as bovine papilloma virus 1, BK virus or Epstein-Barr virus. The expression of viral early genes in the host cell, such as the Epstein-Barr virus nuclear antigen 1 (EBNA-1), activates the viral origin of replication that is present in the vector, allowing its independent replication. This leads to an efficient retention of multiple copies of the plasmid expressing the gene of interest despite a non-equal partitioning between the dividing cells (Van Craenenbroeck et al., 2000). This high retention rate combined with the selective pressure ensures that expanded cells contain the expression construct. In addition, the high plasmid copy number leads to amplification of the gene of interest and higher protein expression, similar to transient transfection experiments (Mazda et al., 1997). Here we focus on the use of the pEAK8 vector, which encodes the puromycin resistance gene as a selection marker, EBNA-1 and the oriP origin of replication (Magistrelli et al., 2010).

Multicistronic expression vectors
Vectors that can drive the expression of multiple genes have several advantages. Firstly, they can be used for the production of multimeric protein complexes resulting from the assembly of different polypeptides. Such protein complexes are frequently found in nature and their structural and functional properties can differ from those of their individual subunits. A series of single and dual promoter vectors was generated, based on the pEAK8 episomal vector described above (Figure 1). These vectors incorporate a multicistronic design enabling the co-expression of 2 to 4 independent genes in addition to the antibiotic resistance genes and viral elements of the original episomal vector. The genes encoding the protein of interest can be cloned downstream of the EF1α or SRα promoters that drive strong gene transcription. One or two subsequent internal ribosome entry sites (IRES) drive the translation of the second and third genes (Komar et al., 2005). The gene located after the first IRES is BirA and encodes a biotin ligase that can add a biotin molecule to a protein fused to the biotin acceptor peptide AviTag™ (Tirat et al., 2006). In all the vectors described here, enhanced green fluorescent protein (EGFP) was placed after the last IRES. In a multicistronic transcript, the last cistron is in principle the least translated. Thus, EGFP expression can be used as a reporter to indicate whether the genes of interest are also expressed, although it should be noted that the expression levels of the protein of interest and EGFP do not necessarily correlate.
In this chapter, we focus on the expression of extracellular proteins or protein complexes. Their secretion into the culture medium is mediated by a leader sequence that can be either the original leader sequence of the protein to be expressed or a generic one. We successfully used the CD33 and Gaussia P. leader sequences for a variety of proteins (Magistrelli et al., 2010). However, significantly different yields can be observed depending on the choice of leader sequence, and this parameter should therefore be considered when expression levels are not satisfactory. When biotinylation of the secreted protein is desired, the biotin ligase must also be secreted so that it can add a biotin to the AviTag™ either during the secretion process or in the extracellular milieu. It is therefore mandatory to add a leader sequence to the BirA gene in order to obtain a biotinylated product.

Transfection and selection
After molecular cloning of the gene (or genes) of interest in one of the vectors described above, the constructs can be verified by DNA sequencing. The plasmids are then transfected into mammalian cells using a liposome-based transfection reagent such as TransIT-LT1 (Mirus, Madison, WI). The transfection step requires only small quantities of DNA and cells, typically 2 x 10^5 cells and 2 μg of plasmid DNA per well, and the transfection is carried out in a 6-well plate. Although different mammalian cell lines can be used, in the examples given below, transformed human embryo kidney monolayer epithelial cells (PEAK cells) were transfected. These cells stably express the EBNA-1 gene, further supporting the episomal replication process, are semi-adherent and can be grown under standard conditions in a cell culture incubator (5% CO2; 37 °C in DMEM medium supplemented with 10% fetal calf serum). After 24 h, cells are placed under selective conditions by adding medium containing 0.5-2 μg/mL puromycin, as cells harbouring the episomal vector are resistant to this antibiotic. Forty-eight hours after transfection, transfection efficiency can be evaluated by epifluorescence microscopy, based on the brightness of the EGFP signal and the proportion of EGFP-positive cells in the wells.
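The quantities quoted above lend themselves to a short bench calculation. The following is a minimal sketch, not part of the published protocol: the function names are hypothetical, and the puromycin stock concentration (10 mg/mL) is an assumption that is not given in the text. Only the per-well cell number and DNA amount come from the chapter.

```python
# Per-well values taken from the text; everything else is illustrative.
CELLS_PER_WELL = 2e5       # 2 x 10^5 cells per well
DNA_UG_PER_WELL = 2.0      # 2 ug of plasmid DNA per well


def plate_requirements(n_wells: int = 6) -> dict:
    """Total cells and plasmid DNA needed to transfect n_wells wells."""
    return {
        "cells": CELLS_PER_WELL * n_wells,
        "dna_ug": DNA_UG_PER_WELL * n_wells,
    }


def puromycin_stock_volume_ul(final_ug_per_ml: float,
                              medium_ml: float,
                              stock_mg_per_ml: float = 10.0) -> float:
    """Volume of stock (in uL) to add to the medium, from C1 * V1 = C2 * V2.

    stock_mg_per_ml is an ASSUMED stock concentration (not from the text).
    1 mg/mL equals 1 ug/uL, so the stock value can be used directly in ug/uL.
    """
    stock_ug_per_ul = stock_mg_per_ml
    return final_ug_per_ml * medium_ml / stock_ug_per_ul


if __name__ == "__main__":
    print(plate_requirements(6))
    print(puromycin_stock_volume_ul(final_ug_per_ml=1.0, medium_ml=500.0))
```

The selection concentration itself (0.5-2 μg/mL in the text) would still need to be chosen empirically for a given cell line.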

Amplification and production
Cells are maintained in a serum-containing medium, which allows for fast growth, high viability and fast expansion without adaptation to serum-free medium. The selection and amplification process of the pool can easily be monitored by the increase in EGFP signal using either epifluorescence microscopy or flow cytometry (Figure 2). At this stage, the expression of the protein of interest can be tested by ELISA or western blot analysis of the supernatant. This early evaluation point at the beginning of the selection process is not absolutely required but provides an indication that the protein can be expressed and secreted in this system. After one week, the cells are transferred to larger vessels and kept under selective pressure in order to expand the transfected cell population. Two to three weeks after transfection, cells can be used to seed Tri-flasks (Nunc) or disposable CELLine bioreactors (Integra) for the production step (Figure 3). Tri-flasks are cell culture vessels that contain three levels for cells to adhere and thus maximize cell density in a limited space. The CELLine is a two-compartment bioreactor that can be used in a standard cell culture incubator. The smaller compartment (15 mL) contains the cells and is separated from a larger (one liter) medium-containing compartment by a semi-permeable membrane with a cut-off size of 10 kDa (Bruce et al., 2002; McDonald et al., 2005). This system allows for the diffusion of nutrients, gases and metabolic waste products, while retaining cells and secreted proteins in the smaller compartment. It is also possible to use serum-free medium in the cell compartment and complete medium in the larger compartment. This allows for the secretion of the protein of interest into serum-free medium, which facilitates the purification process by decreasing the amount of contaminants. As the medium and cell compartments can be accessed independently, medium can be replaced or complemented with fresh medium without losing cells or the protein of interest. The CELLine thus overcomes the limitations of standard cell culture containers and offers the advantage that the secreted protein remains concentrated in a smaller volume, facilitating downstream purification. Using both systems, the culture is maintained for 7-10 days before harvest of the supernatant. As the medium contains serum, the cells maintain high viability and several production runs can be generated using the same cells and containers.

Protein purification
In order to streamline the overall process, an important objective is to efficiently purify the secreted recombinant proteins with a single immobilized metal ion affinity chromatography (IMAC) step. This affinity purification approach is well established for proteins containing a hexahistidine tag (Block et al., 2009). However, optimization was required to efficiently purify recombinant protein from supernatants containing 10% FCS. After harvest, the cell culture supernatants are clarified by centrifugation and filtered through a 0.22 μm membrane. The supernatant from Tri-flasks or other standard cell culture vessels has to be concentrated 20-40 times using a concentration device such as a SartoFlow 200 (Sartorius) with a membrane having an appropriate cut-off size to retain the protein of interest. As mentioned above, this step is not required when using the CELLine bioreactor due to the low volume recovered from the cell compartment. In addition, the concentration step increases the concentration of both the protein of interest and high molecular weight contaminants such as bovine serum albumin or immunoglobulins. In contrast, the supernatant retrieved from the cell compartment of the CELLine bioreactor contains concentrated recombinant protein and reduced levels of contaminants, as they cannot cross the 10 kDa membrane separating the two chambers of the reactor. This increased recombinant protein to contaminant ratio greatly enhances the purification efficiency by IMAC. The concentrated supernatant is then supplemented with 100 mM imidazole and loaded onto Ni-NTA affinity chromatography resin (Qiagen). The relatively high concentration of imidazole minimizes binding of contaminants to the resin. After a wash step, proteins are eluted at a flow rate of 2 mL/min using a 30 mL imidazole gradient (20-400 mM imidazole) on an ÄKTA Prime chromatography system (GE Healthcare). The elution gradient further improves the purity of the recombinant protein but can be replaced by a step elution approach if a chromatography system is not available. The eluted fractions can be analyzed by SDS-PAGE or ELISA to determine their recombinant protein content. The fractions of interest are pooled and desalted on PD-10 columns (GE Healthcare) equilibrated with phosphate buffered saline or another appropriate buffer. The desalted proteins can then be quantified using various techniques and their purity analyzed by SDS-PAGE.
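The elution parameters given above (a 30 mL gradient from 20 to 400 mM imidazole at 2 mL/min) can be turned into a small calculation, for instance to estimate the imidazole concentration at which a given fraction eluted. This is a sketch only: it assumes the gradient is linear, which the text does not state explicitly, and the function names are hypothetical.

```python
def gradient_imidazole_mM(volume_ml: float,
                          start_mM: float = 20.0,
                          end_mM: float = 400.0,
                          gradient_ml: float = 30.0) -> float:
    """Imidazole concentration after volume_ml of an ASSUMED linear gradient."""
    if not 0.0 <= volume_ml <= gradient_ml:
        raise ValueError("volume_ml must lie within the gradient volume")
    return start_mM + (end_mM - start_mM) * volume_ml / gradient_ml


def gradient_duration_min(gradient_ml: float = 30.0,
                          flow_ml_per_min: float = 2.0) -> float:
    """Time needed to run the whole gradient at the given flow rate."""
    return gradient_ml / flow_ml_per_min


if __name__ == "__main__":
    # Concentration at the midpoint of the gradient and total run time.
    print(gradient_imidazole_mM(15.0))   # 210.0 mM under the linear assumption
    print(gradient_duration_min())       # 15.0 min
```

For a step elution, the same helper could be used to pick a few discrete imidazole concentrations spanning the range in which the protein of interest was seen to elute.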
During the expansion step of the cell pools, several vials of cells can be cryopreserved in liquid nitrogen. These frozen pools can be thawed at a later stage and rapidly expanded under selective conditions to produce recombinant protein without the need for a new transfection step. This possibility further accelerates the timelines for the generation of additional recombinant protein batches and is a clear advantage over transient transfection approaches (Table 1).
We have applied the process described above to express and purify 16 mammalian proteins (Table 2). In most cases, several milligrams of highly pure recombinant protein were obtained per liter of cell culture supernatant (Magistrelli et al., 2010). The proteins carried various tags at the N- or C-terminus and could be biotinylated in vivo during the production step via an AviTag™. Most of these proteins were shown to be functionally active in cell-based assays.
Three case studies are reported below, illustrating the expression, purification and characterization of monomeric, homodimeric and heterodimeric recombinant proteins.

Case studies

Monomeric proteins: Recombinant biotinylated human CD16b
Human CD16b, also called FcγRIIIb, is a member of the receptor family for the Fc of immunoglobulins (Takai, 2002). CD16b is a low affinity, GPI-anchored, monomeric receptor expressed on neutrophils and eosinophils. In order to characterize the binding of human IgG to this receptor, a DNA fragment encoding the extracellular portion of hCD16b fused to a C-terminal hexahistidine tag and an AviTag™ was cloned into the single promoter tricistronic vector that drives the co-expression of BirA and EGFP (Figure 1). After transfection and selection, hCD16b was purified in a single step from the cell culture supernatant as described above. The elution fractions corresponding to the main elution peak of the chromatogram were pooled and desalted into phosphate buffered saline. SDS-PAGE analysis showed that hCD16b was highly pure and had the expected molecular weight of 43 kDa (Figure 4).
In order to verify that hCD16b was biotinylated during expression, an aliquot of each pooled fraction was added to streptavidin-coated microplates. After incubation, the plate was washed and the immobilized hCD16b detected using a horseradish peroxidase (HRP)-coupled anti-hexahistidine antibody (Figure 5). The ELISA signals correlated with the intensity of the bands on the SDS-PAGE gel and indicated that hCD16b was efficiently biotinylated, thus facilitating its immobilization on a solid surface.

The ability of purified hCD16b to bind IgG was tested by Surface Plasmon Resonance (SPR) using a Biacore 2000 instrument (GE Healthcare). A CM5 biosensor chip was coated with streptavidin followed by injection of hCD16b, resulting in efficient immobilization of the receptor on the chip surface. Kinetic experiments were performed by injecting various concentrations of hIgG1, ranging from 83 nM to 1.3 μM (Figure 5). Dose-dependent binding was observed and the affinity of the interaction could be determined (KD=7.18x10 - ± 3.2 x10 -5 M). The affinity of human IgG1 for hCD16b was found to be similar to previously described values (Bruhns et al., 2009).
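To make the relationship between analyte concentration, rate constants and affinity concrete, the sketch below uses a simple 1:1 (Langmuir) binding model. This is an assumption: the text does not state which model was fitted to the sensorgrams. The two-fold dilution scheme is also an assumption, chosen only because 83 nM x 16 is close to the 1.3 μM upper limit; the function names and the numerical values in the usage section are hypothetical and are not the measured values.

```python
def two_fold_series(top_nM: float, n_points: int) -> list:
    """Analyte concentrations (nM) in an ASSUMED two-fold dilution series."""
    return [top_nM / 2 ** i for i in range(n_points)]


def kd_from_rates(ka_per_M_s: float, kd_per_s: float) -> float:
    """Equilibrium dissociation constant KD (M) = kd / ka for a 1:1 model."""
    return kd_per_s / ka_per_M_s


def steady_state_response(conc_M: float, KD_M: float, rmax_RU: float) -> float:
    """Equilibrium SPR response for 1:1 binding: Req = Rmax * C / (KD + C)."""
    return rmax_RU * conc_M / (KD_M + conc_M)


if __name__ == "__main__":
    # Five concentrations spanning roughly 83 nM to 1.3 uM.
    print(two_fold_series(1328.0, 5))
    # Illustrative rate constants only (not from the chapter).
    print(kd_from_rates(ka_per_M_s=1e4, kd_per_s=1e-1))
    # At C = KD the surface is half saturated.
    print(steady_state_response(conc_M=1e-5, KD_M=1e-5, rmax_RU=100.0))
```

Under this model, a low affinity interaction in the micromolar range is far from saturation at the concentrations injected, which is one reason why a wide concentration series is useful.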

Homodimeric proteins: Recombinant human IL-17F
Interleukin 17F (IL-17F) is a member of the IL-17 cytokine family and has been shown to play a pro-inflammatory role, particularly in asthma (Kawaguchi et al., 2009). IL-17F is a secreted protein that forms a homodimer linked by a disulfide bond (Hymowitz et al., 2001). This protein was expressed using the single promoter bicistronic variant of the vector (Figure 1). The transfection, selection and purification process described above was applied and the purity of dimeric IL-17F was confirmed by denaturing SDS-PAGE in reducing and non-reducing conditions (Figure 6). As expected, the IL-17F monomer and disulfide-linked dimer had an apparent molecular weight of 20 kDa and 40 kDa, respectively. The biological activity of purified hIL-17F was assessed in a cell-based assay (Yao et al., 1995). Human fibroblasts secrete IL-6 in response to hIL-17F and this activity can be inhibited by a neutralizing antibody directed against hIL-17F. The neutralizing activity of the antibody was tested in the assay using hIL-17F purified from the supernatant or a commercial source of the cytokine (PeproTech). The results indicated that both cytokines have equivalent biological activities and can be neutralized in a similar fashion by the antibody (Figure 7).

Heterodimeric proteins: Recombinant human CD79A/B
As an example demonstrating successful heterodimeric protein complex expression, the sequences encoding the extracellular domains of human CD79A and CD79B were cloned into a dual promoter, tricistronic vector (Figure 1). The native CD79A/B heterodimer is expressed on B lymphocytes and is the signalling component of the B cell receptor complex (Chu et al., 2001). For recombinant expression, the two proteins were fused to different tags. A hexahistidine tag was introduced at the C-terminus of CD79B for purification by IMAC and an AviTag™ at the C-terminus of CD79A for in vivo biotinylation. During the purification step, CD79A/B heterodimer and CD79B homodimer complexes were purified via the hexahistidine tag on CD79B. The purified protein complexes analyzed by SDS-PAGE presented a diffuse pattern due to glycosylation of the proteins (Figure 8). As only CD79A is biotinylated, only the heterodimeric complex can be specifically immobilized on streptavidin-coated surfaces. To confirm the presence of the CD79A/B heterodimer in the purified fractions, aliquots were incubated in streptavidin-coated ELISA plates. After washing, commercial anti-CD79A and anti-CD79B antibodies were added to different wells and detected using an HRP-labeled Fcγ-specific antibody (Figure 9). Positive signals obtained with both anti-CD79A and anti-CD79B antibodies demonstrated that the heterodimer was efficiently produced and captured via biotin-streptavidin interaction.

Conclusions
A number of considerations can influence the choice of system for the expression of recombinant proteins, but the final intended use of the protein is a key determining factor. For most applications, the recombinant protein should closely mimic the structural and functional properties of the native protein. For this reason, mammalian expression, which provides the appropriate folding and complex post-translational and secretion machineries, represents the system of choice for the study of human proteins, in particular for therapeutic applications (Andersen et al., 2002). High yields, flexibility and speed are also important parameters that are difficult to combine in a single and ideal expression system. Transient expression via plasmid transfection into mammalian cells provides maximal flexibility and speed, at the expense of yield. Indeed, only microgram amounts of protein can be obtained unless performed at large scale, a procedure that has its own technical challenges and is therefore not easily implemented in most laboratories (Hacker et al., 2009). At the other end of the spectrum, the establishment and selection of stable cell lines supporting high expression levels is time-consuming but is clearly a system of choice when large amounts of protein are required. In addition, the clonal nature of the cell line increases product homogeneity and batch-to-batch consistency, two highly desirable features for industrial applications. However, neither approach is fully satisfactory when conducting research projects that involve the development of protein-protein interaction assays, structural characterization, immunization or screening procedures. Such activities require milligram amounts of protein, and often multiple variants, fusions or tagged versions of the same protein have to be generated.
The expression system described in this chapter contributes to bridging the gap between yield and speed by providing several attractive features: integration-free maintenance of the expression vector via autonomous episomal replication; single or dual promoter multicistronic vector design for the co-expression of proteins; secretion of biotin ligase for single site in vivo protein biotinylation; co-expression of EGFP for the monitoring of transfection efficiency, selection and amplification of cell pools; cryopreservation of cell pools for additional batch productions; single step affinity purification; and use of disposable bioreactors. The latter element, although not strictly required, significantly enhances the overall quality of the process by providing highly concentrated supernatants containing lower levels of serum-derived contaminants, thus improving the performance of the affinity chromatography step. This 4-6 week process requires standard cell culture and protein purification equipment and can therefore be implemented in most laboratories. Beyond speed and yield, the possibility to obtain single site biotinylated proteins facilitates the development of protein-protein interaction assays via simple biotin-streptavidin oriented immobilization of one of the interacting partners.
Finally, as illustrated by several examples, the mammalian cell machinery offers the possibility to produce homodimeric and heterodimeric protein complexes in significant quantities.
In our laboratory, the availability of this approach has significantly simplified and streamlined the production of high quality recombinant proteins and supported multiple aspects of our research programs. We therefore believe that it could also benefit other research groups and become more widely used for the expression of recombinant proteins.

Fig. 2. EGFP expression in transfected pools of PEAK cells after 2 weeks of selection and propagation, monitored by epifluorescence microscopy (left) or flow cytometry (right).

Fig. 3. Schematic representation and timelines of the overall process.

Fig. 4. Purification of hCD16b. Chromatogram of the gradient elution step (left). The indicated fractions were collected in several pools, desalted and analyzed by SDS-PAGE (right).

Fig. 5. Characterization of hCD16b by ELISA (left) and by Surface Plasmon Resonance on a Biacore 2000 system (right). Pooled fractions of purified hCD16b were captured on a streptavidin-coated ELISA plate and detected with an anti-hexahistidine antibody (left). The sensorgrams were obtained by injecting hIgG1 on hCD16b immobilized on a streptavidin surface (right). A schematic representation of the IgG-CD16b interaction on the chip surface is shown in the top right corner.

Fig. 6. Purification of human IL-17F homodimer. Chromatogram of the gradient elution step (top). The indicated fractions were collected in several pools, desalted and analyzed by SDS-PAGE in non-reducing conditions (bottom left) and in reducing conditions (bottom right).

Fig. 8. Purification of human CD79A/B heterodimer. Chromatogram of the gradient elution step (left). The indicated fractions were collected in several pools, desalted and analyzed by SDS-PAGE (right).

Fig. 9. Schematic representation of the ELISA used for the detection of CD79A/B heterodimer in the pooled elution fractions (top). ELISA signals obtained using anti-CD79A (bottom left) and anti-CD79B (bottom right) antibodies and dilutions of the pooled fractions (pools 2 to 5). Biotinylated hCD79A and hCD79B homodimers were used as positive and negative controls.

Table 1. Characteristics of different mammalian cell expression systems.