Cloning of a novel endogenous promoter for foreign gene expression in Phaeodactylum tricornutum

Phaeodactylum tricornutum is a model diatom for which genome sequence data and expressed sequence tag databases are available. This study aimed to discover a new endogenous promoter that drives strong constitutive expression of a protein of interest in P. tricornutum. To find promoter candidates, the intracellular proteins of P. tricornutum grown to stationary phase were extracted and identified by LC–MS/MS. Glutamine synthetase (GLNA) was one of the most abundantly expressed proteins during the stationary phase. Because a promoter is usually located in the 5′ upstream region of a gene's open reading frame, two fragments of the 5′ upstream region of the GLNA gene, 501 and 997 bp long, were amplified as putative promoters and cloned into enhanced green fluorescent protein (eGFP) reporter systems. The constructed reporter systems were transformed into P. tricornutum, and the eGFP expression levels were compared with those of reporter systems using the promoters of fcpA (fucoxanthin chlorophyll a/c binding protein A) and CIP1 (putative replication-associated proteins of a Chaetoceros lorenzianus-infecting DNA virus) as controls. The expression of eGFP driven by either GLNA promoter (501 or 997 bp) was linearly related to cell density, and eGFP was expressed constitutively regardless of the cultivation phase. At the stationary phase, the eGFP expression level driven by the GLNA promoters was at least 4 times higher than that driven by the fcpA promoter. The 501 and 997 bp regions of the GLNA promoter showed similar activity patterns in transcribing the downstream gene. These results indicate that a GLNA promoter region of at least 501 bp can be used as a strong constitutive promoter in genetic engineering of P. tricornutum.


Introduction
As the human population gradually grows, the consumption of material goods grows with it. The most important of these goods include food, fuel, and pharmaceuticals. Nonrenewable natural resources will be exhausted in the near future, and we will need new systems that are able to create these useful products. Over the past few decades, researchers have made substantial progress in this field and are already developing such systems. Many of these new production systems use diatoms, which are eukaryotic phytoplankton that fix carbon and nitrogen while producing oxygen (Saade and Bowler 2009). Diatoms are increasingly used as cellular "factories" for commercial and industrial applications, including the production of biopharmaceuticals, recombinant proteins, and biofuels. Diatoms are preferred for these applications because they grow relatively rapidly and require only seawater, light, and relatively inexpensive inorganic nutrients.
Electronic supplementary material: The online version of this article (doi:10.1007/s13765-016-0235-y) contains supplementary material, which is available to authorized users.
Phaeodactylum tricornutum is a model diatom, and both its genome sequence and an expressed sequence tag (EST) database spanning 16 different growth conditions (Bowler et al. 2008) are available. In addition, a molecular toolbox for genetic modification has been established (Apt et al. 1996; Siaut et al. 2007; De Riso et al. 2009). P. tricornutum has been genetically engineered to produce various valuable products such as polyunsaturated fatty acids (Hamilton et al. 2014), glycerol, neutral lipid (Yao et al. 2014), and recombinant protein (Hempel et al. 2011). Generally, the strong endogenous fcp promoter, which was isolated and characterized from very closely related tandem genes that encode fucoxanthin chlorophyll a/c binding proteins (Bhaya and Grossman 1993), has been used in the genetic manipulation of P. tricornutum. More recently, the heterologous promoter CIP1 was isolated from a Chaetoceros lorenzianus-infecting DNA virus and applied in P. tricornutum (Kadono et al. 2015). Unfortunately, the fcp promoter is highly dependent on light and on the circadian rhythms of diatom species (Falciatore et al. 1999; Seo et al. 2015). CIP1, meanwhile, is suboptimal because heterologous promoters may be less efficient than endogenous promoters (Diaz-Santos et al. 2013). Thus, a strong, endogenous, constitutive promoter for foreign gene expression and metabolic engineering in P. tricornutum is urgently needed.
This work identifies a novel endogenous promoter that can constitutively express recombinant proteins in P. tricornutum. To do this, we first analyzed proteins expressed in P. tricornutum and selected the glutamine synthetase (GLNA) promoter as a candidate. Second, the GLNA promoter was fused to an eGFP reporter system and the expression level of eGFP was measured. The GLNA promoter strongly and consistently expressed the reporter protein, eGFP, in P. tricornutum.

Culture condition
The P. tricornutum Bohlin UTEX 646 strain was purchased from the UTEX Culture Collection of Algae (The University of Texas at Austin, USA). Cultures were grown in f/2 medium made with 50% artificial seawater (Guillard 1975). Mixed antibiotics, consisting of ampicillin (50 µg/mL), kanamycin (10 µg/mL), and streptomycin (50 µg/mL), were added to the medium to suppress bacterial contamination. Unless otherwise stated, cultures were incubated at 20°C with shaking at 200 rpm, constant aeration, and continuous lighting from a cool white fluorescent lamp (1600 lx).

Preparation of the proteins expressed in P. tricornutum
For the analysis of the proteins expressed in P. tricornutum, cultures were grown as described above. Cells were grown to 1 × 10⁷ cells/mL, pelleted, and suspended in a lysis buffer, pH 8.0, containing 50 mM Tris-Cl, 150 mM NaCl, and 0.1% Triton X-100 (De Riso et al. 2009). The cells were then vortexed and sonicated, and the cell lysates were collected by centrifugation at 17,000 rpm for 20 min at 4°C. The protein concentrations were determined by the Bradford assay (Biorad, Hercules, CA, USA). Samples were prepared separately from three independent cultures.

SDS-PAGE and in-gel digestion
Protein samples (30 µg each) were resolved on a 12% SDS-PAGE gel and visualized with Coomassie blue. The most abundant proteins were excised from the gels (Fig. 1A), and the gel slices were washed with deionized water. The gel slices were then subjected to in-gel digestion following Shevchenko et al. (2006) with slight modifications.

LC-MS/MS analysis and protein identification through database searching
The peptide samples prepared by in-gel digestion were analyzed with an LTQ XL linear ion trap mass spectrometer (Thermo Scientific, San Jose, CA, USA) equipped with a nano-HPLC system (Eksigent, Dublin, CA, USA). Five microliters of each peptide sample was injected. Sorcerer-SEQUEST (Sage-N research, Milpitas, CA, USA) was used to analyze the MS/MS raw data, and Scaffold 4 (Proteome Software, Portland, OR, USA) was used to visualize and validate the MS/MS-based peptide and protein identifications. SEQUEST was set to search the Uniprot P. tricornutum protein database (15,784 entries), allowing up to 2 missed trypsin cleavage sites (Kang et al. 2014).

Plasmid construction
The fcpA promoter was removed by NdeI/EcoRI digestion from the pPha-T1 expression vector, which carries a zeocin resistance gene as a selection marker (Zaslavskaia et al. 2000). The CIP1 promoter (Kadono et al. 2015) was synthesized by Bioneer (Daejeon, Korea) with flanking NdeI/EcoRI sites. The CIP1 promoter and the PCR-amplified potential promoter regions of GLNA (501 and 997 bp long) were then inserted into the pPha-T1 vector at the NdeI/EcoRI sites. All of the primers used in this study are listed in Supplementary Table 1. Next, the gene encoding the reporter protein (eGFP) was PCR amplified from a pEGFP-C2 vector and inserted into the pPha-T1 vector at the EcoRI/BamHI sites. Finally, a pPha-T1 vector containing the fcpA promoter but without the egfp gene was used as a mock construct. All sequencing was done by Macrogen (Seoul, Korea).

Transformation
The cells (3 × 10⁸) were plated on the central third of the f/2 agar plates, which also contained mixed antibiotics. The plates were incubated at 20°C overnight. The next day, 3 mg of 0.6 µm gold microcarriers (9204298 RevD, Biorad) was coated with 15 µg of DNA for each of the vector constructs, according to the manufacturer's instruction for the Biolistic® PDS-1000/He Particle Delivery System (165-2257, Biorad). The coated microcarriers were used for four bombardments. Cell bombardment was carried out at 1350 psi. The distance between the target plate and the starting point of bombardment was 6 cm. Bombarded cells were allowed to recover at 20°C for 2 days, and after that, cells were resuspended in f/2 medium, and 10⁸ cells were plated on f/2 agar selection plate containing 100 µg/mL zeocin. After 4 weeks of selection, resistant colonies were further analyzed. During all experiments, resistant colonies were cultured in f/2 medium with 100 µg/mL zeocin and mixed antibiotics under continuous lighting from a cool white fluorescent lamp (1600 lx).

Fluorescence measurement of eGFP by fluorometer
After the cell density was measured, 200 µL of cell culture was centrifuged at 3500 rpm for 5 min at 4°C. The pelleted cells were lysed with lysis buffer, pH 8.0, containing 50 mM Tris-HCl, 150 mM NaCl, 1 mM PMSF, 1 mM EDTA, 1% nonidet, and 0.1% Tween-20. Lysates were incubated on ice for 20 min and then centrifuged at 3500 rpm for 15 min at 4°C. The eGFP fluorescence of the supernatant was measured with excitation at 485/20 nm and emission at 528/20 nm using a fluorometer (Multi-Detection Microplate Reader Synergy HT, Biotek, Winooski, VT, USA). All measurements were made on three independent replicates, and the autofluorescence value of the mock construct (containing the fcpA promoter alone) was subtracted from the eGFP fluorescence values of the fcpA, CIP1, GLNA-501, and GLNA-997 promoter constructs.
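The background correction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the readings are hypothetical placeholder numbers rather than data from this study, the function names are invented for the example, and it assumes the mean of the mock replicates is subtracted from each sample replicate (the text does not say whether subtraction was paired or mean-based).

```python
from statistics import mean, stdev

def background_corrected(sample_readings, mock_readings):
    """Subtract the mean mock autofluorescence from each sample replicate.

    Assumption: mean-based (not paired) subtraction.
    """
    mock_mean = mean(mock_readings)
    return [reading - mock_mean for reading in sample_readings]

def mean_sd(values):
    """Return (mean, sample standard deviation) of replicate values."""
    return mean(values), stdev(values)

# Hypothetical raw fluorescence readings (arbitrary units), three replicates each.
mock_raw = [10.0, 11.0, 12.0]
sample_raw = [165.0, 168.0, 171.0]

corrected = background_corrected(sample_raw, mock_raw)
avg, sd = mean_sd(corrected)
print(f"relative eGFP fluorescence: {avg:.1f} ± {sd:.1f}")
```

Reporting the corrected values as mean ± SD matches the convention used for the fluorescence values in this article.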

Statistical analysis
Data were expressed as mean ± SD. Statistical analysis was performed using Student's t test and one-way analysis of variance (ANOVA) followed by Tukey's test for multiple comparison. A p value < 0.05 was considered statistically significant.
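To make the ANOVA step concrete, the following sketch computes the one-way ANOVA F statistic from first principles using only the Python standard library. It is not the software used in this study (which is not named), the group values are hypothetical placeholders, and the function name is invented for the example. Converting F to a p value, and running Tukey's test, requires the F and studentized range distributions from a statistics package and is not shown.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of replicate groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of replicates around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical triplicate measurements for three constructs (arbitrary units).
hypothetical_groups = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0],
]
f_stat, df_b, df_w = one_way_anova_f(hypothetical_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```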

Protein identification
The gels resolved by SDS-PAGE were sliced into five bands (Fig. 1A), and a total of 1836 proteins were identified from the bands by LC-MS/MS analysis. The most abundant protein identified from each band is listed in Fig. 1B. The LC-MS/MS sequence coverage of these proteins ranged from 23 to 56%. Of these five proteins, ribulose bisphosphate carboxylase (Stevens et al. 1996), fructose-bisphosphate aldolase (Pelzer-Reith et al. 1995), and fucoxanthin chlorophyll a/c binding protein E (Apt et al. 1996) have already had their promoters used to construct transformation vectors in diatom species. After considering the total spectrum counts and the EST databases (Uma Maheswari et al. 2010) for the remaining two, glutamine synthetase (GLNA) and glyceraldehyde-3-phosphate dehydrogenase (GapC1), we selected the GLNA promoter for further experimentation. The LC-MS/MS protein sequence coverage, spectrum, and fragmentation table of GLNA are available in Supplementary Fig. 1.

Plasmid construction and transformation
The pPha-T1 vector, which contains a zeocin resistance gene driven by the fcpB promoter (Zaslavskaia et al. 2000), was used as the backbone for all constructed vectors (Fig. 2A). The promoters commonly used for genetic manipulation of P. tricornutum are around 500 bp long. The potential promoter regions of GLNA (Uniprot Accession: B7G6Q6), which were 501 and 997 bp long, were extracted with the Biomart tool of Ensembl Protists (Kinsella et al. 2011) (Fig. 2B). The fcpA and CIP1 promoters were used as endogenous and heterologous constitutive promoters, respectively, to drive reporter protein expression. The pPha-T1 vector containing the fcpA promoter but without the egfp gene was used as a mock construct (Fig. 2C). All prepared constructs were transformed into cells grown to stationary phase. Transformed cells were selected on an f/2 agar selection plate containing 100 µg/mL zeocin for 4 weeks, after which all resistant colonies were transferred to liquid f/2 selection medium containing 100 µg/mL zeocin. Cells grown in liquid f/2 medium were further screened by genomic PCR for integration of the foreign DNA (data not shown) and by eGFP fluorescence for expression of the reporter protein.
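As an illustration of what extracting a 5′ upstream region involves, the sketch below slices a fixed number of bases immediately upstream of a start codon and checks the fragment for internal NdeI (CATATG) or EcoRI (GAATTC) sites, which would interfere with NdeI/EcoRI cloning. This is not the Biomart procedure used here: the toy sequence, the coordinates, and the function names are all hypothetical, and real work would start from the annotated genome.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Recognition sequences; both are palindromic, so one strand is enough to scan.
CLONING_SITES = {"NdeI": "CATATG", "EcoRI": "GAATTC"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def upstream_region(chrom_seq, start_index, length, strand="+"):
    """Return `length` bases immediately 5' of the start codon, read 5'->3'.

    start_index is the 0-based forward-strand index of the first base of the
    start codon (for a minus-strand gene, the forward-strand base that pairs
    with the first base of the start codon).
    """
    if strand == "+":
        begin = start_index - length
        if begin < 0:
            raise ValueError("not enough sequence upstream of the start codon")
        return chrom_seq[begin:start_index]
    if strand == "-":
        begin = start_index + 1
        end = begin + length
        if end > len(chrom_seq):
            raise ValueError("not enough sequence upstream of the start codon")
        return reverse_complement(chrom_seq[begin:end])
    raise ValueError("strand must be '+' or '-'")

def internal_sites(fragment):
    """List cloning enzymes whose recognition site occurs inside the fragment."""
    upper = fragment.upper()
    return [name for name, site in CLONING_SITES.items() if site in upper]

# Toy forward-strand sequence with a start codon (ATG) at index 9.
toy_plus = "AAACCCGGGATGTTT"
promoter_plus = upstream_region(toy_plus, 9, 6)

# Toy minus-strand gene: the forward strand shows CAT where the gene has ATG.
toy_minus = "TTTCATGGGCCC"
promoter_minus = upstream_region(toy_minus, 5, 4, strand="-")

print(promoter_plus, promoter_minus, internal_sites(promoter_plus))
```

For the fragments in this study, `length` would be 501 or 997, and a fragment with an empty `internal_sites` result could be cloned with NdeI/EcoRI without being cut internally.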

eGFP expression
A colony selected from each construct was cultured in f/2 medium, supplemented with 100 µg/mL zeocin and mixed antibiotics, until stationary phase. The cell density and eGFP expression of each cell culture were checked daily (Fig. 3). Cells were seeded at a concentration of 10⁵ cells/mL at day 0 and grown to ~10⁷ cells/mL at day 10. According to the cell growth curve, days 6 and 10 were the mid-log phase and stationary phase, respectively. The fcpA promoter-driven construct reached its highest eGFP expression level, 166.3 ± 6.0 relative eGFP fluorescence, at the log phase, and the expression decreased to 35.0 ± 2.5 relative eGFP fluorescence at the stationary phase (day 11) (Fig. 3A). The CIP1-driven construct did not show the predicted increase in expression between the lag and the stationary phase; the expression was 12.3 ± 0.6 relative eGFP fluorescence on day 6 and 16.0 ± 0.6 relative eGFP fluorescence on day 11 (Fig. 3B). Interestingly, the eGFP expression level of the GLNA-501-driven construct reached 44.3 ± 0.6 relative eGFP fluorescence during the early log phase and 156.0 ± 8.1 relative eGFP fluorescence during the stationary phase (day 11) (Fig. 3C). The GLNA-997-driven construct also showed constitutive eGFP expression, which was 72.7 ± 1.2 relative eGFP fluorescence during the middle log phase and 197.3 ± 1.7 relative eGFP fluorescence during the stationary phase (day 11) (Fig. 3D). Thus, the eGFP expression of the GLNA-501- and GLNA-997-driven constructs was less than 50% of that of the fcpA promoter during mid-log phase (Fig. 3E). During the stationary phase, however, the eGFP expression of the GLNA-501- and GLNA-997-driven constructs was 4 and 5 times higher, respectively, than that of the fcpA promoter (Fig. 3F). We did not observe the predicted result from the CIP1-driven construct (Kadono et al. 2015) in these culture conditions.
The eGFP expression level of the CIP1-driven construct was lower than that of all other promoters and increased slightly throughout stationary phase (Fig. 3B).
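The fold differences quoted in this section can be checked directly from the mean fluorescence values reported in the text. The short sketch below does that arithmetic; the dictionary layout and variable names are ours, and the log-phase entries use the values exactly as reported (fcpA at the log phase, GLNA-501 at the early log phase, GLNA-997 at the middle log phase).

```python
# Mean relative eGFP fluorescence values as reported in the text.
reported = {
    "fcpA": {"log": 166.3, "stationary": 35.0},
    "GLNA-501": {"log": 44.3, "stationary": 156.0},
    "GLNA-997": {"log": 72.7, "stationary": 197.3},
}

def ratio_to_fcpa(construct, phase):
    """Ratio of a construct's mean fluorescence to that of fcpA in one phase."""
    return reported[construct][phase] / reported["fcpA"][phase]

ratios = {
    (construct, phase): ratio_to_fcpa(construct, phase)
    for construct in ("GLNA-501", "GLNA-997")
    for phase in ("log", "stationary")
}

# GLNA-997 relative to GLNA-501 at the stationary phase.
glna_997_vs_501 = (
    reported["GLNA-997"]["stationary"] / reported["GLNA-501"]["stationary"]
)

for (construct, phase), value in ratios.items():
    print(f"{construct} / fcpA ({phase}): {value:.2f}")
print(f"GLNA-997 / GLNA-501 (stationary): {glna_997_vs_501:.2f}")
```

The log-phase ratios fall below 0.5 and the stationary-phase ratios exceed 4, consistent with the comparisons stated above.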

Discussion
One of the most important genetic engineering tools for foreign gene expression and metabolic engineering is a strong constitutive promoter that can drive the production of large amounts of protein, whether retained inside the host organism or secreted from it. To maximize productivity, it is preferable to have the highest level of protein expression during the stationary phase of cell culture. The activity of the fcpA and CIP1 promoters has also been evaluated during the stationary phase (Apt et al. 1996; Kadono et al. 2015). In this study, therefore, the search for novel candidate promoters focused on proteins that are highly expressed during the stationary phase. After evaluating the protein expression of P. tricornutum during the stationary phase, we examined the most highly expressed proteins (Fig. 1), such as GLNA and GapC1. Given its abundant protein expression and the EST database information, GLNA was chosen as a candidate protein regulated by a strong constitutive promoter.
The fcpA promoter-harboring construct showed the expected pattern of eGFP expression. Its eGFP expression level increased from the lag to the log phase and decreased from the log to the stationary phase, resulting in an expression level similar to that of the CIP1 promoter-harboring construct during the early stationary phase (Fig. 3A, B). However, in its original description, the CIP1 promoter-harboring construct expressed about three times more reporter protein than the fcpA promoter-harboring construct during the stationary phase (Kadono et al. 2015). We suspected that different experimental conditions, especially light intensity, could be the main cause of the reduced CIP1 promoter activity, so we changed the light intensity from 1600 lx to 3000 lx during cultivation. As a result, the ratio of eGFP expression from the two promoters (3:1, CIP1:fcpA) was similar to that in the reference (Kadono et al. 2015), which implies that CIP1 could be a light-inducible promoter (Supplementary Fig. 2). The eGFP expression driven by the GLNA promoters was linearly related to cell density, and the reporter protein was expressed constitutively during all cell cultivation phases. The eGFP expression driven by the GLNA promoter regions was at least four times greater than that driven by the fcpA promoter during the stationary phase (day 11) (Fig. 3F). This is higher still than the eGFP expression driven by CIP1 under ideal conditions, which was only three times greater than that of the fcpA promoter (Kadono et al. 2015). Although the eGFP expression driven by the GLNA-997 promoter region was around 30% greater than that driven by the GLNA-501 promoter region, a GLNA promoter region of at least 501 bp was able to express eGFP consistently in these culture conditions.
In this study, the GLNA promoter drove strong constitutive expression of the reporter protein, regardless of the cell cultivation phase. A GLNA promoter region of at least 501 bp could be used as a genetic manipulation tool for foreign gene expression or metabolic engineering of P. tricornutum. Further study is needed to better understand the function of the GLNA promoter and to express valuable recombinant proteins in P. tricornutum using the GLNA promoter.