Construction of a High-Expression System in Bacillus through Transcriptomic Profiling and Promoter Engineering

Bacillus subtilis is an ideal host for the secretion and expression of foreign proteins. The promoter is one of the most important elements for high-level production of recombinant proteins. To expand the repertoire of strong promoters for biotechnological applications in Bacillus species, 14 highly transcribed genes were selected based on transcriptome profiling of B. pumilus BA06, and their promoters were evaluated for strength in B. subtilis. As a result, a strong promoter, P2069, was obtained, which drove expression of the genes encoding alkaline protease (aprE) and green fluorescent protein (GFP) more efficiently than the control promoter (PaprE), by 3.65-fold and 18.40-fold, respectively. Promoter engineering was then applied to P2069, yielding a mutant promoter (P2069M) that increased GFP expression by 3.67-fold over the wild-type promoter (P2069). Moreover, IPTG-inducible expression systems were constructed using the lac operon together with the strong promoters P2069 and P2069M; these systems worked well in both B. subtilis and B. pumilus. In this study, a highly efficient expression system for Bacillus was constructed based on transcriptome data and promoter engineering, which provides not only a new option for recombinant expression in B. subtilis but also a novel genetic tool for B. pumilus.


Introduction
Bacillus subtilis, a rod-shaped Gram-positive soil bacterium, has been used as a model microorganism in both basic research and biotechnological applications for more than a century [1][2][3][4]. It can produce and secrete abundant industrial proteins [5]. Therefore, B. subtilis has been developed as an expression host for the production of recombinant proteins [6][7][8][9]. The major advantages of B. subtilis over other gene expression hosts include non-biased codon usage, high-cell-density fermentation and protein secretion capability [6][7][8][9][10][11]. Along with the availability of the complete genome sequence of B. subtilis [12], a number of genetic tools, including various expression vectors, promoters, regulatory elements and signal peptides, have been developed and characterized [13][14][15][16]. Although a variety of heterologous genes have been expressed successfully in B. subtilis, plasmid instability and low protein yields have hindered its application [17]. Thus, efficient production of high-value recombinant proteins in B. subtilis remains a major challenge. Using high-copy plasmids and strong promoters is an important way to obtain large amounts of recombinant protein. Promoters are important regulatory elements that facilitate high-level gene expression and recombinant protein production [6,8]. Several approaches have been used to identify novel promoters, including screening

Construction of B. subtilis Expression Vectors
All of the vectors used in this study are listed in Table 1. The primers used to construct each vector are listed in Table S1. The vector pSU03-AP allows secretory expression of the alkaline protease (AprE) from B. pumilus BA06 in Bacillus [38]. pMUTIN4 was used to provide the lacI gene [39]. pGEM-T-HSA (Catalog No. HG10968-G), which carries the full-length cDNA encoding human serum albumin (HSA), was purchased from Sino Biologic, Inc. (Beijing, China).

Table 1. Vectors used in this study.
Plasmid                  Characteristics                                    Source
pMUTIN4                  Ampr, lacZ, ori, lacI                              [39]
pGEM-T-HSA               Ampr, ori, f1ori, HSA                              Sino Biologic, Inc.
pSU03-Pxxxx-AP *         Ampr, kanr, ori, DSO, SSO, rep, aprE               This work
pSU03-PaprE-GFP          Ampr, lacZ, kanr, ori, DSO, SSO, rep, GFP          This work
pSU03-P2069-GFP          Ampr, lacZ, kanr, ori, DSO, SSO, rep, GFP          This work
pSU03-P2069X-GFP **      Ampr, lacZ, kanr, ori, DSO, SSO, rep, GFP          This work
pSU03-P2069M-GFP         Ampr, lacZ, kanr, ori, DSO, SSO, rep, GFP          This work
pSU03-P43-GFP            Ampr, lacZ, kanr, ori, DSO, SSO, rep, GFP          This work
pSU03-P2069-lacI-GFP     Ampr, lacZ, kanr, ori, DSO, SSO, lacI, rep, GFP    This work
pSU03-P2069M-lacI-GFP    Ampr, lacZ, kanr, ori, DSO, SSO, lacI, rep, GFP    This work
pSU03-P2069-AP           Ampr, lacZ, kanr, ori, DSO, SSO, rep, DHAP         This work
pSU03-P2069M-AP          Ampr, lacZ, kanr, ori, DSO, SSO, rep, DHAP         This work
pSU03-P2069-HSA          Ampr, lacZ, kanr, ori, DSO, SSO, rep, HSA          This work
pSU03-P2069M-HSA         Ampr, lacZ, kanr, ori, DSO, SSO, rep, HSA          This work
*: Pxxxx represents the promoters of different genes; **: P2069X represents the mutant promoters derived from P2069. catr, chloramphenicol-resistance gene; Ampr, ampicillin-resistance gene; kanr, kanamycin-resistance gene; rep, replication initiation gene; SSO, single-strand origin; DSO, double-strand origin.
To screen for strong promoters, 14 gene promoters were selected based on their transcriptional levels in B. pumilus BA06 (Table S2). The promoter sequences were amplified by PCR with 2× Phanta Max Master Mix (Vazyme Biotech Co., Ltd., Nanjing, China) and the corresponding primers (Table S1), using genomic DNA of B. pumilus BA06 as template. The vector backbone was amplified from pSU03-AP with the primers P18M13-R/PAPSta-F. The PCR program consisted of an initial denaturation at 95 °C for 3 min, followed by 30 cycles of 95 °C for 15 s, 56 °C for 15 s and 72 °C for 30 s to 5 min depending on the target length, and a final extension at 72 °C for 5 min. Both the vector fragment and the promoter fragments were then purified with a Gel Recovery Kit (Omega Bio-tek, Inc., Norcross, GA, USA). Each promoter fragment was cloned into the vector pSU03-AP by recombination cloning, using primers with overlapping sequences at the 5′-end (Figure S1). The recombination reaction was performed following the instructions of the ClonExpress® II One Step Cloning Kit (Vazyme Biotech Co., Ltd., Nanjing, China). The recombination product was directly transformed into E. coli DH5α. The resulting expression vectors, named pSU03-Pxxxx-AP, were confirmed by DNA sequencing.
To replace the aprE gene in pSU03-Pxxxx-AP, the green fluorescent protein (GFP) gene and the human serum albumin (HSA) cDNA were amplified by PCR using the primer pairs PGFP.F/PGFP.R and PHSA.F/PHSA.R, respectively. The vector fragment was amplified using the primers PdeAP-F/PdeAP-R with pSU03-P2069-AP as template. Recombination cloning was performed as described above, resulting in pSU03-P2069-GFP and pSU03-P2069-HSA. The promoter P43 is recognized as a strong promoter commonly used in recombinant expression systems of B. subtilis [6]. To compare the activity of P2069 with that of P43, the P43 promoter sequence was amplified from genomic DNA of B. subtilis WB600 with the primers P43.F/P43.R and inserted into pSU03-P2069-GFP in place of the P2069 sequence by recombination cloning as described above, generating the vector pSU03-P43-GFP.
Strong inducible promoters are routinely used for recombinant protein expression. Therefore, the lacO operator sequence from the plasmid pMUTIN4 [39] was inserted between the −10 box and the RBS in the promoter sequences of P2069 and P2069M by overlap PCR using the primers I2069-F2/I2069-R2 (Figure S2A). The PCR program consisted of an initial denaturation at 95 °C for 3 min and 30 cycles of denaturation at 95 °C for 15 s, annealing at 58 °C for 15 s and extension at 72 °C for 5 min. Meanwhile, the lacI gene with its promoter (PenP) and the terminator sequence from the plasmid pMUTIN4 [39] were amplified using two pairs of primers (PlacI.F/PlacI.R and PrrnBT-F/PrrnBT-R), respectively. These two fragments were subsequently cloned between the Amp- and Kan-resistance genes in pSU03-P2069-GFP and pSU03-P2069M-GFP by recombination cloning (Figure S2B), resulting in the two expression vectors pSU03-P2069-lacI-GFP and pSU03-P2069M-lacI-GFP.
All vectors constructed in this work were confirmed by DNA sequencing (Tsingke Biotech Co., Ltd., Chengdu, China).

Engineering Promoter P2069
A saturation mutagenesis approach was used to engineer the promoter P2069. A pair of mutagenesis primers (Prapmut-F/Prapmut-R) was designed to mutate two regions of P2069: the region between the −35 and −10 boxes, and the region between the −10 box and the ribosome binding site (RBS). The mutagenesis PCR was performed using pSU03-P2069-GFP as template in a 50 µL mixture containing 25 µL of 2× Phanta Max Master Mix, 2 µL of forward and reverse primers and about 100 ng of template DNA. The PCR program consisted of an initial denaturation at 95 °C for 3 min, 30 cycles of 95 °C for 15 s, 58 °C for 15 s and 72 °C for 5 min, and a final extension at 72 °C for 10 min. The PCR products were treated with DpnI to remove the template DNA and then transformed into competent E. coli cells as described by Sambrook et al. [40]. All E. coli transformants were collected from the LB agar plates, and the pooled plasmid DNA was isolated as the promoter mutation library after the mutations had been verified by DNA sequencing.

Transformation of B. subtilis by Chemical Competence
Competent cells of B. subtilis WB600 were prepared as follows: 0.3 mL of an overnight culture in LB medium was transferred into 10 mL of LB medium and incubated at 37 °C to an OD600 of about 1.0; 0.3 mL of this LB culture was then transferred into 10 mL of MD medium (0.92 mL 10× PC buffer [600-mM K2HPO4, 440-mM KH2PO4, 30-mM C6H5Na3O7], 0.1 mL 5-mg/mL L-Trp, 0.1 mL 50% glucose, 0.005 mL 2.2-mg/mL ferric ammonium citrate, 0.24 mL 0.5-M potassium aspartate, 0.03 mL 1-M MgSO4, 8.625 mL ddH2O) and incubated at 37 °C to an OD600 of about 1.0. Then, 3 mL of competent cells were mixed with 1.0 µg of vector DNA in a test tube and cultivated at 37 °C for a further 1 h. To continue cultivation, 300 µL of 10× LB medium was added. After incubation at 37 °C with vigorous shaking for another 1.5 h, the cells were harvested and plated on LB agar plates containing the appropriate antibiotic.
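As a quick consistency check on the MD medium recipe above, the following sketch sums the listed volumes and estimates the final concentration of each component. The dictionary layout and function names are illustrative only; the volumes and stock concentrations are taken from the recipe as written.

```python
# MD medium components as listed in the recipe:
# name -> (volume in mL, stock concentration, unit of the stock concentration).
MD_COMPONENTS = {
    "PC buffer": (0.92, 10.0, "x"),
    "L-Trp": (0.1, 5.0, "mg/mL"),
    "glucose": (0.1, 50.0, "%"),
    "ferric ammonium citrate": (0.005, 2.2, "mg/mL"),
    "potassium aspartate": (0.24, 0.5, "M"),
    "MgSO4": (0.03, 1.0, "M"),
    "ddH2O": (8.625, None, None),  # diluent, no stock concentration
}


def total_volume_ml(components):
    """Sum of all component volumes in mL."""
    return sum(volume for volume, _, _ in components.values())


def final_concentrations(components):
    """Dilute each stock into the total volume; the diluent is skipped."""
    total = total_volume_ml(components)
    return {
        name: (volume * stock / total, unit)
        for name, (volume, stock, unit) in components.items()
        if stock is not None
    }


if __name__ == "__main__":
    print(f"total volume: {total_volume_ml(MD_COMPONENTS):.3f} mL")
    for name, (conc, unit) in final_concentrations(MD_COMPONENTS).items():
        print(f"{name}: {conc:.4g} {unit}")
```

The listed volumes add up to 10.02 mL, which agrees with the stated 10 mL of MD medium.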

Transformation of B. pumilus by Electroporation
B. pumilus BA06 was transformed according to the high-osmolarity electroporation protocol for B. subtilis, with some modifications [41]. Competent cells of B. pumilus BA06 were prepared as follows: 0.5 mL of an overnight culture in LB broth was transferred into 50 mL of LB medium (containing 0.5-M sorbitol and 5% betaine) and incubated at 37 °C to an OD600 of about 1.0; the cells were harvested by centrifugation at 5000 rpm and 4 °C for 10 min. After three washes with 50 mL of ice-cold washing buffer (0.5-M sorbitol, 0.5-M mannitol, 10% glycerol and 7.5% betaine), the cells were resuspended in 1 mL of electroporation buffer (0.5-M sorbitol, 1-M mannitol, 10% glycerol and 7.5% betaine). For electroporation, 80 µL of competent cells were mixed with 1 µg of plasmid DNA in an EP tube and transferred to a precooled 1-mm electroporation cuvette. The cuvette was placed in a MicroPulser II (Bio-Rad Co., Hercules, CA, USA) and pulsed with the following settings: 2200 V, 25 µF, 200 Ω. Immediately after the electrical discharge, 1 mL of recovery medium (LB containing 0.5-M sorbitol and 0.38-M mannitol) was added to the cells. The cells were incubated overnight at 37 °C and spread on LB agar plates containing the appropriate antibiotics.
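The pulse settings above can be translated into two commonly quoted electroporation quantities: the nominal RC time constant and the field strength across the cuvette. The sketch below is illustrative only. The function names are hypothetical, the gap width is passed in by the caller (the 1 mm used in the example call is an assumption about the cuvette), and the actual pulse duration also depends on sample conductivity, so the computed time constant is a nominal value.

```python
def rc_time_constant_ms(resistance_ohm, capacitance_uf):
    """Nominal RC time constant in milliseconds.

    ohm * microfarad gives microseconds, so divide by 1000 for ms.
    """
    return resistance_ohm * capacitance_uf * 1e-3


def field_strength_kv_per_cm(voltage_v, gap_mm):
    """Field strength in kV/cm for a given voltage and cuvette gap."""
    if gap_mm <= 0:
        raise ValueError("gap_mm must be positive")
    return (voltage_v / 1000.0) / (gap_mm / 10.0)


if __name__ == "__main__":
    # Settings from the text: 2200 V, 25 uF, 200 ohm.
    print(f"tau = {rc_time_constant_ms(200, 25):.1f} ms")
    # Gap width of 1 mm is an assumption for this example.
    print(f"E = {field_strength_kv_per_cm(2200, 1.0):.1f} kV/cm")
```

With the stated 200 Ω and 25 µF, the nominal time constant is 5 ms.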

Recombinant Protein Expression and Analysis
An overnight culture of the B. subtilis recombinant carrying the designated vector was inoculated into fresh LB medium containing the appropriate antibiotics and incubated at 37 °C with shaking at 200 rpm. At the indicated time points, 1 mL of culture was sampled from each flask. Cell density was measured as OD600. Extracellular alkaline protease activity was determined by the Folin-Ciocâlteu method with casein as substrate, as described previously [42]. The caseinolytic reaction was performed at 50 °C for exactly 10 min. One unit of activity was defined as the amount of enzyme required to release 1 µg of tyrosine.
To determine GFP fluorescence intensity, a single colony of the bacterial recombinant containing the GFP reporter gene was transferred into 5 mL of LB broth and incubated overnight at 37 °C. Then, 0.2 mL of the culture was transferred into 20 mL of LB containing the appropriate antibiotics in a 100-mL flask and incubated at 37 °C with shaking at 200 rpm. The cell suspension (100 µL) was sampled at the indicated time points and loaded onto a black OptiPlate-96 F plate (PerkinElmer Co., Ltd., Waltham, MA, USA). The fluorescence intensity (expressed as a.u.) was measured with excitation at 484 nm and emission at 507 nm using a Synergy H1 microplate reader (BioTek, Winooski, VT, USA). GFP expression is presented as the fluorescence intensity per unit of OD600.
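The normalization described above (fluorescence intensity divided by OD600) can be sketched as follows. This is a minimal illustration with hypothetical function names. The optional blank subtraction is an assumption that the text does not mention, so it defaults to zero.

```python
def specific_fluorescence(fluorescence_au, od600, blank_au=0.0):
    """Fluorescence per unit of OD600.

    blank_au is an optional background reading (an assumption, not
    described in the text); the default of 0 applies no correction.
    """
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return (fluorescence_au - blank_au) / od600


def normalize_time_course(samples, blank_au=0.0):
    """Convert (time_h, fluorescence_au, od600) tuples into
    (time_h, fluorescence per OD600) tuples."""
    return [(t, specific_fluorescence(f, od, blank_au)) for t, f, od in samples]


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    readings = [(12, 3000.0, 1.5), (24, 9000.0, 3.0)]
    for time_h, value in normalize_time_course(readings):
        print(f"{time_h} h: {value:.0f} a.u. per OD600")
```

Dividing by OD600 separates changes in per-cell expression from changes in culture density, which matters when strains grow differently.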
HSA expression was analyzed by western blotting. An overnight culture of the B. subtilis recombinant carrying the HSA reporter gene was inoculated into 20 mL of fresh TSB (17.0 g tryptone, 3.0 g soybean peptone, 2.5 g C6H12O6, 5.0 g NaCl, 2.5 g K2HPO4) containing the appropriate antibiotics and shaken at 37 °C for 24 h. The culture was harvested by centrifugation at 5000 × g for 10 min and washed with PBS (137-mM NaCl, 2.7-mM KCl, 10-mM Na2HPO4, 2-mM KH2PO4, pH 7.4). The cell pellet was resuspended in TSE buffer (10-mM Tris-HCl, pH 8.0, 50-mM NaCl, 1-mM EDTA) containing 100-mg/mL lysozyme and 2-mM PMSF and incubated at 37 °C for 30 min. The cells were disrupted by sonication, and the lysate was clarified by centrifugation at 12,000 × g for 10 min at 4 °C. Equal amounts of protein were separated by 10% SDS-PAGE and transferred onto a PVDF membrane. A rabbit polyclonal anti-HSA antibody (Proteintech Biology Co., Ltd., Wuhan, China) was used to detect HSA. A donkey secondary antibody (IgG) conjugated to Alexa Fluor® 680 (Abcam Trading Co., Ltd., Shanghai, China) was used to detect the primary antibody. The HSA signal was imaged with an Azure C500 instrument (Azure Biosystems, Inc., Dublin, CA, USA).

Statistical Analysis
All of the above experiments were performed in triplicate. Data are presented as mean values with standard deviations. Statistically significant differences between the control and test groups were assessed by a two-tailed unpaired Student's t-test in Microsoft Excel. Significance levels are indicated by one star for p < 0.05, two stars for p < 0.01, three stars for p < 0.001 and four stars for p < 0.0001.
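The summary statistics and star notation described above can be sketched with the Python standard library. This is an illustration, not the authors' Excel workflow. The function names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which form was used. The sketch computes the pooled-variance Student's t statistic only; turning it into a p-value requires the t distribution (for example Excel's T.TEST or `scipy.stats.ttest_ind`).

```python
from math import sqrt
from statistics import mean, stdev, variance


def summarize(replicates):
    """Mean and sample standard deviation of replicate measurements."""
    return mean(replicates), stdev(replicates)


def student_t_statistic(control, test):
    """Unpaired Student's t statistic with pooled (equal) variance."""
    n1, n2 = len(control), len(test)
    pooled_var = ((n1 - 1) * variance(control) + (n2 - 1) * variance(test)) / (n1 + n2 - 2)
    return (mean(test) - mean(control)) / sqrt(pooled_var * (1 / n1 + 1 / n2))


def significance_stars(p_value):
    """Map a p-value to the star notation defined in the text."""
    for threshold, stars in ((0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*")):
        if p_value < threshold:
            return stars
    return ""  # not significant at p < 0.05


if __name__ == "__main__":
    # Hypothetical triplicates, for illustration only.
    control, test = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
    print(summarize(control), summarize(test))
    print(f"t = {student_t_statistic(control, test):.3f}")
    print(significance_stars(0.005))
```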

Selection and Evaluation of the Promoter Strength in B. subtilis
Previously, transcriptome profiling of B. pumilus BA06 by RNA-seq showed that more than 96% of genes had FPKM values below 2000. In contrast, a few genes had extremely high FPKM values (Table S3), suggesting that their expression may be controlled by strong promoters. Therefore, 14 genes (cds0544, cds0765, cds0780, cds0812, cds0828, cds1073, cds1075, cds1196, cds1571, cds1927, cds2069, cds2659, cds2646, cds3514) were selected according to their transcriptional levels (Table S2), and the activity of their promoters was evaluated on a multicopy vector in heterologous B. subtilis cells.
To assess the activity of the 14 promoters, the sequences upstream of their start codons (about 500-600 bp) were amplified by PCR and inserted immediately upstream of the aprE gene, which encodes the alkaline protease from B. pumilus BA06, in the expression vector pSU03-AP [38]. The resulting 14 expression vectors were transformed into B. subtilis WB600 and confirmed by colony PCR. Initially, the B. subtilis recombinants carrying the different promoters were spotted side by side on LB agar plates containing 1% milk powder. Promoter activity in driving alkaline protease expression could be evaluated preliminarily from the size of the hydrolysis zone around each colony. Figure 1A shows that the native promoter of aprE (positive control) produced no hydrolysis zone at 24 h. In contrast, obvious hydrolytic activity was observed at the same time point for the promoters of cds2069 and cds0544. At 48 h, six recombinant strains (carrying the promoters of cds0544, cds0812, cds1073, cds1075, cds2069 and cds2659) produced significantly larger hydrolysis zones than the control.
Figure 1A-E. Data are averages of three independent experiments with standard deviation. Significance levels are indicated by "*" for p < 0.05, "**" for p < 0.01, "***" for p < 0.001 and "****" for p < 0.0001.
To confirm the most efficient promoters, the caseinolytic activities of the 15 recombinants were determined after 48 h of fermentation. Figure 1B shows that the activities of the promoters of cds0544, cds0812, cds1073, cds1075, cds2069 and cds2659 were significantly higher than that of the control promoter PaprE (p < 0.01), with increases of 30.2%, 84.7%, 55.5%, 51.2%, 260.9% and 49.1%, respectively.
We further compared the growth patterns and the time course of alkaline protease synthesis among the six selected recombinants. Figure 1C shows that the growth curves of the various B. subtilis recombinants and the control strain did not differ obviously. Cell density peaked at 18 h and declined thereafter. However, the recombinant strains carrying the promoters of cds1073 and cds1075 showed a faster decline in cell density in the late growth phase. By contrast, alkaline protease activity differed considerably among the recombinant strains (Figure 1D). The promoter of cds2069 (P2069) gave the highest proteolytic activity over the entire growth period, with a peak of 757.6 U/mL, which was 3.65-fold higher than that of the aprE promoter (PaprE).
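The percentage increases and fold values reported above describe the same kind of comparison on two scales, and the sketch below converts between them. It rests on one reading of the text, namely that "3.65-fold higher" denotes the ratio of test to control; this is an assumption, although it sits close to the 260.9% increase reported for cds2069 at 48 h, which corresponds to a ratio of about 3.61. The function names are hypothetical.

```python
def percent_increase_to_ratio(percent_increase):
    """A 260.9% increase over the control means test/control = 3.609."""
    return 1.0 + percent_increase / 100.0


def ratio_to_percent_increase(ratio):
    """Inverse of percent_increase_to_ratio."""
    return (ratio - 1.0) * 100.0


def implied_control_value(test_value, ratio):
    """Control value implied by a test value and a test/control ratio."""
    return test_value / ratio


if __name__ == "__main__":
    print(f"{percent_increase_to_ratio(260.9):.3f}")
    # Implied PaprE peak if 757.6 U/mL is 3.65 times the control
    # (interpretation assumed, not stated in the text).
    print(f"{implied_control_value(757.6, 3.65):.1f} U/mL")
```

Under this reading, the PaprE peak activity would be roughly 208 U/mL; the text itself does not report that value.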
To simplify monitoring of recombinant protein production in the subsequent experiments, we replaced the aprE gene in the expression vectors with the GFP gene. Two vectors, pSU03-P2069-GFP and pSU03-PaprE-GFP, were constructed and transformed into B. subtilis WB600. Figure 1E shows that the fluorescence intensities of these two recombinant strains differed significantly (p < 0.001 or lower) at each time point. GFP expression driven by P2069 began in the exponential growth phase and peaked in the stationary phase (Figure 1E). At 60 h, the P2069-driven GFP expression level was 18.4-fold higher than that of PaprE, indicating that P2069 is a good candidate promoter for expressing target genes in B. subtilis.
The B. subtilis promoter P43 is a well-known strong promoter that has been widely used in recombinant expression systems of B. subtilis [6]. We therefore asked whether the activity of P2069 is higher or lower than that of P43. To this end, we constructed another expression vector, pSU03-P43-GFP, and transformed it into B. subtilis WB600. P2069 was significantly more active than P43 in driving GFP expression in B. subtilis (Figure 1F). The GFP expression level under the control of P2069 was 3.48- and 4.48-fold higher than that of P43 at 24 h and 48 h, respectively. Taken together, the heterologous promoter P2069 performed much better than P43 and could be used as a strong promoter in the B. subtilis expression system.
Microorganisms 2020, 8, 1030

Engineering Promoter P2069
In the B. pumilus BA06 genome, cds2069 is annotated as encoding the response regulator aspartate phosphatase A (RapA). RapA is a stage 0 sporulation protein whose gene is controlled by a sigma A-dependent promoter; such promoters are common among housekeeping genes and early sporulation genes in B. subtilis [43]. Many sigma A-dependent promoters, such as P43, PyCH and PsecA, exhibit strong activity during the exponential growth phase in B. subtilis [44][45][46]. We then analyzed the sequence of P2069 using the online tool BPROM (http://softberry.com/berry.phtml?topic=bprom&group=programs&subgroup=gfindb). Figure 2 shows the predicted −35 and −10 elements and the spacer region. The −10 and −35 boxes are not identical to the conserved prokaryotic sequences "TATAAT" and "TTGACA", and the distance between them is 18 bp. In addition, the conserved TG dinucleotide is found 1 bp upstream of the −10 box.
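The features described above (similarity of the −35 and −10 boxes to their consensus hexamers, spacer length, and a TG dinucleotide 1 bp upstream of the −10 box) can be checked with a short script once the box positions are known. The sketch below is illustrative. The function names are hypothetical, the box positions are supplied by the caller rather than predicted, and the example sequence is made up for the demonstration; it is not the P2069 sequence.

```python
CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"


def consensus_matches(hexamer, consensus):
    """Number of positions at which a hexamer agrees with the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("hexamer and consensus must have the same length")
    return sum(a == b for a, b in zip(hexamer.upper(), consensus))


def describe_promoter(seq, start_35, start_10):
    """Summarize core promoter features.

    start_35 and start_10 are 0-based start indices of the -35 and -10
    hexamers, e.g. taken from a prediction tool's output.
    """
    seq = seq.upper()
    box_35 = seq[start_35:start_35 + 6]
    box_10 = seq[start_10:start_10 + 6]
    return {
        "box_35": box_35,
        "box_10": box_10,
        "matches_35": consensus_matches(box_35, CONSENSUS_35),
        "matches_10": consensus_matches(box_10, CONSENSUS_10),
        "spacer_bp": start_10 - (start_35 + 6),
        # TG separated from the -10 box by exactly one nucleotide.
        "tg_motif": start_10 >= 3 and seq[start_10 - 3:start_10 - 1] == "TG",
    }


# Made-up sequence: 4 nt, -35 box, 18-nt spacer ending in TGC, -10 box, 4 nt.
EXAMPLE_SEQ = "AAAA" + "TTGACT" + "A" * 15 + "TGC" + "TATACT" + "GGGG"

if __name__ == "__main__":
    print(describe_promoter(EXAMPLE_SEQ, start_35=4, start_10=28))
```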
The conserved promoter elements, such as the −10 and −35 regions, are generally essential for transcription initiation and cannot be altered. However, the regions upstream and downstream of these conserved sequences may modulate promoter strength [47]. Therefore, promoter engineering was performed to further enhance the strength of P2069 by random mutagenesis of two regions, one between the −35 and −10 regions and one between the −10 box and the RBS, which are underlined in red in Figure 2. The mutagenesis PCR was performed using pSU03-P2069-GFP as template, and transformation into E. coli DH5α generated a mutation library of about 11,000 clones. The mutated regions were confirmed by DNA sequencing.

The plasmid DNA isolated from the mutation library was then transformed into B. subtilis WB600, and more than 10,000 transformants were obtained in several independent transformation experiments. On LB agar plates, the transformants exhibited different fluorescence intensities, which allowed us to pick colonies with higher fluorescence than the control (native P2069 promoter). In total, we picked 342 clones with higher green fluorescence than the wild-type promoter based on plate screening; these were inoculated into 96-well plates for further screening by monitoring fluorescence intensity. Finally, 23 clones with clearly increased fluorescence intensity were obtained. We recovered plasmid DNA from these 23 clones, as well as from 8 clones with lower fluorescence, and determined their nucleotide sequences by DNA sequencing. After grouping identical mutant sequences, 12 mutant promoters with increased fluorescence intensity were obtained. Figure 3A shows the altered nucleotides in the selected promoter mutants. Their promoter strengths spanned more than two orders of magnitude in B. subtilis, indicating that changes in the regions between the conserved promoter elements can profoundly influence promoter activity. The promoter activity of these mutants was also determined in E. coli, where they likewise differed in strength (Figure 3A). However, the activity of a given mutant was generally not consistent between B. subtilis and E. coli.
Finally, GFP expression driven by the 12 stronger mutant promoters was monitored over 60 h of fermentation. Among them, the mutant promoter designated P2069M gave the highest fluorescence intensity, a 3.67-fold increase over the wild-type promoter (P2069) at 48 h.

Test of the P2069M Promoter to Drive Gene Expression
The aprE and HSA genes were chosen to test the strength of the mutant promoter P2069M. Two expression vectors, pSU03-P2069M-AP and pSU03-P2069M-HSA, were constructed and transformed into B. subtilis WB600. Figure 4A shows that the extracellular caseinolytic activity under the control of P2069M peaked at 36 h, with an increase of more than 1-fold over the wild-type P2069.
The B. subtilis recombinants carrying the HSA gene were incubated in TSB medium for 24 h, and the cell lysates were used to detect HSA expression by western blotting. Figure 4B shows that recombinant HSA was expressed, as indicated by a protein band of about 66 kDa. Furthermore, the HSA expression level was clearly higher with the mutant promoter (P2069M) than with the wild-type promoter (P2069). These results demonstrate that P2069M not only increased recombinant GFP expression but also drove higher expression of the aprE and HSA genes in B. subtilis. However, the extent of the increase achieved with P2069M depended on the heterologous gene being expressed.
[Figure caption fragment: "... promoter activity of the mutation promoters with the native P2069 in driving GFP expression in B. subtilis WB600."]


Construction of Inducible Expression System in Bacillus
Inducible expression is an important strategy in recombinant protein production because it allows heterologous gene expression to be switched on at a chosen stage, which is especially useful for producing toxic proteins and for achieving high-density fermentation. Therefore, two inducible expression vectors (pSU-P2069-lacI-GFP and pSU-P2069M-lacI-GFP) were constructed by introducing the E. coli lacO operator and lacI gene. These two vectors were transformed into B. subtilis WB600 and B. pumilus BA06. The resulting four recombinants were incubated in LB broth at 37 °C until the OD600 reached ~2.0. IPTG was then added to a final concentration of 0.9 mM. After 4 h of induction, the fluorescence intensities were quantified. In both WB600 and BA06, the fluorescence intensity increased highly significantly relative to the control without IPTG (p < 0.0001; Figure S3), indicating that the inducible system works in both B. subtilis and B. pumilus.

To determine the optimal inducer concentration and induction time, IPTG concentrations from 0.01 mM to 1.5 mM were tested separately. Figure 5A,B shows that recombinant GFP expression was induced even at 0.01 mM IPTG in both B. subtilis and B. pumilus. The GFP expression level rose quickly as the IPTG concentration increased to 0.3-0.5 mM and then remained steady up to 1.5 mM. Under these conditions, P2069M-lacO was more active than the wild-type promoter. Next, the induction time was examined from 1 h to 40 h after IPTG was added to a final concentration of 1.2 mM. As shown in Figure 5C, recombinant GFP accumulated exponentially during the first 18 h after IPTG addition in both strains and then remained steady up to 40 h. In addition, induction efficiency was higher for both promoters in BA06 than in WB600, especially at the later stage of induction.
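The plateau described above can be located programmatically from a dose-response series. The following is a minimal sketch, assuming the saturating dose is defined as the lowest tested concentration whose signal reaches a chosen fraction of the maximum observed signal. The function name, the 0.9 threshold, and the fluorescence readings are hypothetical placeholders for illustration and are not data from this study.

```python
from typing import Sequence


def saturating_concentration(
    concentrations: Sequence[float],
    signals: Sequence[float],
    fraction: float = 0.9,
) -> float:
    """Return the lowest tested concentration whose signal reaches
    `fraction` of the maximum observed signal."""
    if len(concentrations) != len(signals) or len(concentrations) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    threshold = fraction * max(signals)
    # Walk through the doses in ascending order and stop at the first one
    # whose signal reaches the threshold.
    for conc, signal in sorted(zip(concentrations, signals)):
        if signal >= threshold:
            return conc
    raise RuntimeError("unreachable: the maximum always meets the threshold")


# Hypothetical placeholder readings (arbitrary fluorescence units),
# NOT measurements from this study.
iptg_mM = [0.01, 0.05, 0.1, 0.3, 0.5, 0.9, 1.2, 1.5]
gfp_au = [120.0, 300.0, 520.0, 920.0, 960.0, 980.0, 990.0, 1000.0]

print(saturating_concentration(iptg_mM, gfp_au))
```

A threshold-based definition is only one possible choice; fitting a Hill-type curve would give a smoother estimate when enough concentrations are sampled.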
IPTG is a synthetic analog of lactose that is commonly used as an inducer in recombinant protein expression. Considering the cost of commercial fermentation, lactose or galactose may serve as an alternative to IPTG. Therefore, we tested whether lactose and galactose could induce GFP expression in our system. Figure 5D shows that lactose did not act as an inducer, whereas galactose induced GFP expression only weakly in comparison with IPTG under these conditions.

Discussion
RNA sequencing enables genome-wide identification of RNA-based regulatory elements with extremely low background and unprecedented depth [48]. The technology covers a larger dynamic range of expression levels and accurately quantifies transcripts with low expression levels [49]. Transcriptional analysis based on RNA sequencing is therefore one of the most powerful tools, providing not only important insights into the functional elements of the genome, gene expression patterns and regulation, but also a simpler, more economical and effective method for applied research [50]. For example, several strong promoters have been screened from transcriptome data and applied to recombinant protein expression in B. subtilis [21,51,52]. Accordingly, 14 genes with extra-high expression levels (FPKM values) were selected from the transcriptome profile of B. pumilus BA06 and evaluated for their promoter strength in B. subtilis. As a result, the candidate promoter P2069 was identified as having the strongest activity and was found to be more active than P43, a commonly used strong promoter in B. subtilis recombinant expression systems. Our results indicate that it is feasible to screen for strong promoters using transcriptomic data, even from a heterologous species. However, discrepancies may occur for some promoters. For example, the FPKM values of genes cds0765 and cds1196 were high, but their promoters drove alkaline protease gene expression only weakly. Some of these genes, such as cds0765 and cds0780, are sporulation-related and may be repressed during the vegetative growth phase. In view of their performance on the multi-copy plasmid (Figure 1B), these promoters may not be suitable for recombinant protein expression. In contrast, the promoter of gene cds0544, which has a lower FPKM value, achieved higher activity (Table S2). This suggests that there is no strict positive correlation between a high transcriptional level in native cells and strong promoter activity in recombinant expression. Therefore, promoter strength must be confirmed experimentally for biotechnological applications, even when the promoters are derived from highly transcribed genes.
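One way to quantify how loosely native transcript abundance tracks recombinant promoter activity is a rank correlation between the two measurements. The sketch below computes Spearman's rho in plain Python, assuming paired per-promoter values are available. The function names and the example numbers are hypothetical and do not reproduce Table S2 or any measurement from this study.

```python
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over all entries tied with the current value.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical placeholder values, NOT data from this study.
fpkm = [5200.0, 4100.0, 3900.0, 2500.0, 1800.0]
activity = [40.0, 310.0, 120.0, 35.0, 280.0]
print(round(spearman_rho(fpkm, activity), 3))
```

A rho close to 1 would indicate that FPKM ranking predicts promoter ranking well; values near 0 would be in line with the observation that experimental confirmation is needed.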
Promoter engineering is one of the most important strategies for improving the efficiency and productivity of exogenous protein expression [17]. Random mutagenesis is an effective way to enhance promoter strength without requiring extensive knowledge of sequence-to-function mapping. For example, error-prone PCR (ep-PCR) targets both the consensus and non-consensus promoter regions indiscriminately, which makes it easy to obtain promoter libraries with a large dynamic range [53]; a mutation library of the bacteriophage-derived promoter PL-λ constructed in this way showed a 196-fold dynamic range of expression in E. coli [54]. However, ep-PCR may yield a large number of inactive mutants, perhaps because elements critical for transcription are mutated [17]. To overcome this limitation, more targeted methods that draw on a molecular understanding of promoter function can be employed. For example, saturation mutagenesis has been used to target the spacer region between the consensus −35 and −10 motifs [55], and randomizing the linker region between the −35 and −10 motifs produced a promoter library with a 400-fold dynamic range in Lactococcus lactis [46]. In this study, we constructed a mutation library of P2069 by saturation mutagenesis targeting two regions: the spacer between the −35 and −10 boxes, and the region between the −10 box and the RBS. Several of the resulting mutant promoters turned out to be quite strong, whereas others decreased or even abolished GFP fluorescence. These results indicate that the sequence context between the consensus elements can play an important role in modulating promoter strength in prokaryotes. Figure 3A shows the sequences of 12 promoter variants with enhanced fluorescence intensity and eight with weaker fluorescence intensity. The mutated nucleotides and their positions differed considerably among the variants. Because only a small number of mutant promoters were sequenced, it is difficult to reliably infer which sites or nucleotide changes have a positive or negative impact on promoter activity.
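The difficulty of inferring sequence-to-function rules from a few sequenced clones can be illustrated by the size of a saturation mutagenesis library. The sketch below assumes fully randomized positions (four bases each) and equally abundant variants, and uses the standard sampling estimate N = ln(1 − P) / ln(1 − 1/V) for the number of clones needed to observe a given variant with probability P. The number of randomized positions used in the example is hypothetical and is not taken from this study.

```python
import math


def library_size(randomized_positions: int) -> int:
    """Number of distinct sequences when each position can be A, C, G or T."""
    if randomized_positions < 1:
        raise ValueError("at least one randomized position is required")
    return 4 ** randomized_positions


def clones_for_coverage(variants: int, probability: float = 0.95) -> int:
    """Clones to screen so that a given variant is sampled at least once
    with the stated probability, assuming all variants are equally abundant."""
    if variants < 2:
        raise ValueError("variants must be >= 2")
    if not 0 < probability < 1:
        raise ValueError("probability must be in (0, 1)")
    # N = ln(1 - P) / ln(1 - 1/V); log1p keeps precision for large V.
    return math.ceil(math.log1p(-probability) / math.log1p(-1 / variants))


# Hypothetical example: six fully randomized positions (an assumption).
n_variants = library_size(6)
print(n_variants, clones_for_coverage(n_variants, 0.95))
```

Under these assumptions, the required screening depth is roughly three times the library size for 95% coverage, which is far more than the 20 variants sequenced here.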
The promoter P2069 controls the expression of the RapA protein in B. pumilus BA06, which is a stage-0 sporogenesis protein. Sequence analysis suggests that P2069 is likely a sigma A-dependent promoter (Figure 2). Sigma A-dependent promoters occur commonly in housekeeping genes and early sporogenesis genes in B. subtilis [43]. Because sigma A-dependent promoters of B. subtilis can also be recognized by sigma70 in E. coli [56], we also analyzed the activity of the mutant promoters in E. coli (Figure 3A). The GFP fluorescence intensity was much lower in E. coli than in B. subtilis, possibly because the sigma factor-RNA polymerase complexes of the two microorganisms recognize promoters with different structures [57]. Therefore, some promoters that are strong in Bacillus may be relatively weak in E. coli. For example, P2069M, the strongest promoter screened in B. subtilis, is relatively weak in E. coli.
An inducible strong promoter is an important factor in achieving high-level expression of a target gene in B. subtilis. Inducible systems have the advantage that gene expression can be switched on at the proper time during fermentation [58], which is sometimes essential, for example, when the product is toxic to the host cell. Therefore, this work also addressed the construction of an inducible expression system based on our strong promoter. The inducible systems worked well in both B. subtilis WB600 and B. pumilus BA06. However, induction was more effective in BA06 than in WB600, especially at later induction times, as shown in Figure 5C. This may reflect the fact that the promoter P2069 originates from B. pumilus. Notably, GFP expression under the control of P2069-lacO and P2069M-lacO was significantly lower than that under the native P2069 and its mutant P2069M, which may be attributed to the increased length of the spacer region between the −10 box and the RBS in these promoters. The addition of IPTG may be another reason, because IPTG seemed to reduce the growth of the bacteria.

Conclusions
In this study, we screened out a strong promoter, P2069, from B. pumilus based on transcriptome data and experimental evaluation; it drives higher recombinant expression of foreign genes in B. subtilis. Further, a mutant promoter, P2069M, was obtained by engineering the wild-type promoter with saturation mutagenesis targeting two regions: the spacer between the −35 and −10 boxes, and the region between the −10 box and the RBS. The expression level of GFP under the control of P2069M was further increased by 3.67-fold in comparison with P2069. Moreover, IPTG-inducible expression systems were constructed based on the strong promoters P2069 and P2069M. The expression systems constructed here not only work well in B. subtilis but also provide a genetic manipulation tool for the non-model Gram-positive B. pumilus. In conclusion, combining a transcriptome-driven strategy with promoter engineering is a practical approach to selecting stronger promoters for biotechnological applications and biosynthetic biology.
Supplementary Materials: The following are available online at http://www.mdpi.com/2076-2607/8/7/1030/s1. Table S1: Primers used to construct the expression vectors and for DNA sequencing; Table S2: Expression levels (FPKM) of the selected genes in B. pumilus BA06; Table S3: Grouped statistics of the gene numbers with various transcriptional levels in the transcriptome profiling of B. pumilus BA06; Figure S1: Scheme to construct expression vectors via the recombinant cloning; Figure S2: Construction of the inducible expression vectors with the lacO system; Figure