A Transcriptional Regulator Sll0794 Regulates Tolerance to Biofuel Ethanol in Photosynthetic Synechocystis sp. PCC 6803

To improve ethanol production directly from CO2 in photosynthetic cyanobacterial systems, one key issue that needs to be addressed is the low ethanol tolerance of cyanobacterial cells. Our previous proteomic and transcriptomic analyses found that several regulatory proteins were up-regulated by exogenous ethanol in Synechocystis sp. PCC6803. In this study, through tolerance analysis of the gene disruption mutants of the up-regulated regulatory genes, we uncovered that one transcriptional regulator, Sll0794, was related directly to ethanol tolerance in Synechocystis. Using a quantitative iTRAQ-LC-MS/MS proteomics approach coupled with quantitative real-time reverse transcription-PCR (RT-qPCR), we further determined the possible regulatory network of Sll0794. The proteomic analysis showed that in the Δsll0794 mutant grown under ethanol stress a total of 54 and 87 unique proteins were down- and up-regulated, respectively. In addition, electrophoretic mobility shift assays demonstrated that the Sll0794 transcriptional regulator was able to bind directly to the upstream regions of sll1514, slr1512, and slr1838, which encode a 16.6 kDa small heat shock protein, a putative sodium-dependent bicarbonate transporter and a carbon dioxide concentrating mechanism protein CcmK, respectively. The study provided a proteomic description of the putative ethanol-tolerance network regulated by the sll0794 gene, and revealed new insights on the ethanol-tolerance regulatory mechanism in Synechocystis. As the first regulatory protein discovered related to ethanol tolerance, the gene may serve as a valuable target for transcription machinery engineering to further improve ethanol tolerance in Synechocystis. All MS data have been deposited in the ProteomeXchange with identifier PXD001266 (http://proteomecentral.proteomexchange.org/dataset/PXD001266).

Through combined strategies of strain improvement and process optimization, current fermentative production of ethanol employing microbes such as the yeast Saccharomyces cerevisiae and the bacterium Zymomonas mobilis has reached a very high level of productivity, such that more than 20% (v/v) ethanol is produced in industrial yeast fermentation vessels from starch-derived glucose (1). This technological progress contributed significantly to the recent increase in worldwide ethanol production, from 17.0 billion liters in 2000 to more than 84.6 billion liters in 2011 (2). One of the successful strain improvement strategies was to obtain ethanol-tolerant strains by either mutant selection or more directed metabolic and genome engineering approaches and to apply them directly in industry, as ethanol is known to be highly toxic to cells (3–8). In addition to engineering an individual gene or enzyme for better tolerance, an increasing number of studies have recently used regulatory genes or proteins as targets for ethanol-tolerance improvement, as more evidence suggests that microbes tend to employ multiple resistance mechanisms in dealing with the stress of a single biofuel product (9,10), and the manipulation of regulatory genes could provide a route to complex phenotypes that are not readily accessible by traditional methods of targeting a number of metabolic genes (11,12). For example, a global transcription machinery engineering approach has been applied to S. cerevisiae for improved glucose/ethanol tolerance, and mutagenesis of the transcription factor Spt15p followed by selection led to dominant mutations that conferred increased tolerance and more efficient glucose conversion to ethanol (13). In another study, screening a yeast disruptome against high ethanol concentrations revealed that genes involved in sensing or regulation of amino acid metabolism were required for yeast resistance to ethanol and can be used to engineer ethanol tolerance (14).
More recently, expression of a mutated global regulator gene, irrE, from the extremely radiation-resistant bacterium Deinococcus radiodurans led to a 10- to 100-fold enhancement of Escherichia coli tolerance to ethanol or butanol in shock experiments (15); and error-prone PCR-based engineering of a native global transcription factor, cAMP receptor protein, which is known to regulate over 400 genes in E. coli, resulted in ethanol-tolerant cAMP receptor protein mutants with a higher growth rate in 62 g/L ethanol and a higher survival rate in 150 g/L ethanol (16). Together, these studies demonstrated that engineering transcriptional machinery could be a very efficient way to increase ethanol tolerance, and eventually ethanol production in industry (17).
As a nonnative platform for ethanol production, autotrophic cyanobacteria have attracted significant attention in recent years because of concerns that increasing production of ethanol directly from agricultural crops would require diverting farmland and crops to biofuel production, competing with the world food supply and causing economic and ethical problems (18). By expressing pyruvate decarboxylase (pdc) and alcohol dehydrogenase (adh) of Z. mobilis in the cyanobacterium Synechococcus sp. PCC 7942, Deng and Coleman achieved up to 230 mg/L ethanol directly from CO2 within 4 weeks of growth (19). By constructing a genome-scale Synechocystis sp. PCC 6803 metabolic network and simulating cell growth under various conditions, Fu improved the ethanol production in the cyanobacterial host up to 690 mg/L in a week (20). More recently, by systematic evaluation and selection of adh genes from different cyanobacterial sources and optimization of culturing conditions, Gao et al. obtained an engineered Synechocystis sp. PCC 6803 strain with significantly higher ethanol-producing efficiency of 212 mg/L per day and 5.50 g/L in 26 days (21). However, to fully realize the ethanol-producing potential that photosynthetic systems can offer, an urgent issue that needs to be addressed is the extremely low tolerance of cyanobacteria to ethanol (22). In our previous investigations, we found that 1.0-2.0% ethanol was enough to cause growth inhibition of Synechocystis sp. PCC 6803 (hereafter Synechocystis) (23). To identify possible targets related to ethanol tolerance, quantitative iTRAQ-LC-MS/MS based proteomics and RNA-seq based transcriptomics were applied to determine the metabolic response of Synechocystis under ethanol stress, and the results showed that multiple transcriptional regulators were differentially regulated, providing potential gene targets for engineering transcriptional machinery in order to improve ethanol tolerance in Synechocystis (23,24).
In this work, by constructing gene knockout mutants and conducting phenotypic analyses, we demonstrated that a transcriptional regulator, Sll0794, was involved in ethanol tolerance. Further proteomic analysis along with electrophoretic mobility shift assays (EMSAs)¹ allowed the determination of a possible Sll0794 regulatory network and the identification of several possible gene targets of Sll0794. The results showed that ethanol-tolerance regulation in Synechocystis may be mediated by direct binding of the Sll0794 transcriptional regulator to the upstream regions of sll1514, slr1838, and slr1512, which encode a 16.6 kDa small heat shock protein, a carbon dioxide concentrating mechanism protein CcmK, and a putative sodium-dependent bicarbonate transporter, respectively. As the first transcriptional regulator found to be involved in ethanol tolerance, Sll0794 could be a useful target for further improving ethanol tolerance through a transcriptional machinery engineering approach in Synechocystis (13,25).

EXPERIMENTAL PROCEDURES
Bacterial Growth Conditions-Synechocystis sp. PCC 6803 and the knockout mutants constructed in this study were grown in BG11 medium (pH 7.5) under a light intensity of ~50 μmol photons m−2 s−1 in an illuminating incubator shaken at 130 rpm at 30°C (HNY-211B Illuminating Shaker, Honor, China) (23,24). Cell density was measured on a UV-1750 spectrophotometer (Shimadzu, Japan) at OD730 or an ELx808 Absorbance Microplate Reader (BioTek, Winooski, VT) at OD630. For control growth and ethanol (1.5%, v/v) treatment, 10 ml of fresh cells at OD730 of 0.5 were collected by centrifugation and then inoculated into 50 ml BG11 liquid medium in a 250-ml flask. Analytical-grade ethanol was purchased from Merck (Darmstadt, Germany). Growth experiments were repeated at least three times to confirm the growth patterns. Cells for proteomics analysis were collected by centrifugation at 8000 × g for 10 min at 4°C. To reveal differences in cell phenotype, flow cytometric analysis was performed to compare the wild-type and the mutant cells under ethanol stress on a FACSCalibur fluorescence-activated cell sorting (FACS) cytometer (Becton Dickinson) as described before (23).
Construction and Analysis of Knockout Mutants-A fusion PCR-based method was employed for the construction of gene knockout fragments (26). Briefly, for each selected gene target, three sets of primers were designed to amplify a linear DNA fragment containing the chloramphenicol resistance cassette (amplified from plasmid pACYC184) with two flanking arms of DNA upstream and downstream of the targeted gene. The linear fused PCR amplicon was used directly for transformation into Synechocystis by natural transformation. The chloramphenicol-resistant transformants were obtained, confirmed for the gene knockout event by PCR and sequencing, and then passaged several times on fresh BG11 plates supplemented with 10 μg/ml chloramphenicol to achieve complete chromosome segregation (confirmed by PCR). Three transcriptional regulator-encoding genes, sll0792, sll0794, and sll1423, which were found to be differentially regulated by ethanol exposure at either the protein or RNA level in previous studies (23,24), were selected for construction of gene knockout mutants. The successful knockout mutants were confirmed by PCR and sequencing analysis. PCR primers for mutant construction and validation are listed in supplemental Table S1.
¹ The abbreviations used are: CCM, CO2-concentrating mechanism; EMSAs, electrophoretic mobility shift assays; FACS, fluorescence-activated cell sorting; HPLC, high-performance liquid chromatography; IPTG, isopropyl β-D-1-thiogalactopyranoside; iTRAQ, isobaric tag for relative and absolute quantitation; LC-MS/MS, liquid chromatography-tandem mass spectrometry; MS, mass spectrometry; PCR, polymerase chain reaction; RT-qPCR, quantitative real-time reverse transcription-PCR; SDS-PAGE, sodium dodecyl sulfate polyacrylamide gel electrophoresis; TBE, Tris/Borate/EDTA; TCA, tricarboxylic acid; FSC, forward scatter; SSC, side scatter.
Comparative growth analysis of the wild-type Synechocystis and the mutants was performed in 100-ml flasks, each with 10 ml BG11 medium with or without 1.5% (v/v) ethanol. Cultivation conditions were the same as described above. Growth analysis was performed in biological triplicates.
Proteomics Analysis-1) Protein preparation and digestion: for each sample, 10 mg of cells were frozen in liquid nitrogen immediately after centrifugation and washed with phosphate buffer (pH 7.2). The cells were broken by sonication at low temperature and then centrifuged for 20 min at 20,000 × g to collect the supernatant. In addition, the cell debris was resuspended in a lysis buffer (8 M urea, 4% 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate (CHAPS), 40 mM Tris-HCl) with 1 mM PMSF and 2 mM ethylenediaminetetraacetic acid (EDTA) (final concentrations). After 5 min of vigorous vortexing, dithiothreitol was added to a final concentration of 10 mM. After mixing, the samples were centrifuged for 20 min at 20,000 × g, and the supernatant was mixed well with ice-cold acetone (1:4, v/v) containing 30 mM dithiothreitol. After repeating this step twice, all supernatants were combined and precipitated at −20°C overnight, and stored at −80°C prior to sample cleanup if not for immediate use. For digestion, the protein pellet from the previous step was resuspended in digestion buffer (100 mM triethylammonium bicarbonate (TEAB), 0.05% (w/v) sodium dodecyl sulfate (SDS)) to a final concentration of 1 mg/ml (total protein measured by bicinchoninic acid assay (Sigma, St. Louis, MO)). Equal aliquots (500 μg) from each lysate were then digested with trypsin overnight at 37°C (Sigma; 1:40 w/w added at 0 and 2 h) and lyophilized; 2) iTRAQ labeling: the iTRAQ labeling of peptide samples derived from each of the wild-type control and the gene knockout mutant samples was performed using the iTRAQ Reagent 8-plex kit (Applied Biosystems, Foster City, CA) according to the manufacturer's protocol. Four samples (two biological replicates for the wild-type control and two biological replicates for the mutant) were individually labeled with iTRAQ reagents.
The 113-, 114-, 119-, and 121-iTRAQ tags were used for the wild-type control replicates 1 and 2 and the Δsll0794 mutant replicates 1 and 2, respectively. The peptides were labeled with the respective isobaric tags, incubated for 2 h, and vacuum centrifuged to dryness. The iTRAQ-labeled peptides in buffer A (10 mM KH2PO4, 25% acetonitrile, pH 2.85) were fractionated using a PolySULFOETHYL A™ strong cation exchange chromatography (SCX) column (200 × 4.6 mm, 5 μm particle size, 200 Å pore size) on an HPLC system (Shimadzu, Japan) at a flow rate of 1.0 ml min−1. The 50 min HPLC gradient consisted of 100% buffer A (10 mM KH2PO4, 25% acetonitrile, pH 2.85) for 5 min; 0-20% buffer B (10 mM KH2PO4, 25% acetonitrile, and 500 mM KCl, pH 3.0) for 15 min; 20-40% buffer B for 10 min; and 40-100% buffer B for 5 min, followed by 100% buffer A for 10 min. The chromatograms were recorded at 218 nm. A total of twenty fractions of labeled peptides were collected and then individually desalted with Sep-Pak® Vac C18 cartridges (Waters, Milford, MA), concentrated to dryness using a vacuum centrifuge, and reconstituted in 0.1% formic acid for LC-MS/MS analysis; 3) LC-MS/MS proteomic analysis: the mass spectrometry analysis was performed using an AB SCIEX TripleTOF™ 5600 mass spectrometer (AB SCIEX, Framingham, MA) coupled with an online micro-flow HPLC system (Shimadzu, Japan) as described before (23,24). The peptides were separated using a nanobore C18 column with a picofrit nanospray tip (75 μm ID × 15 cm, 5 μm particles) (New Objective, Woburn, MA). The separation was performed at a constant flow rate of 20 μl min−1, with a splitter to obtain an effective flow rate of 0.2 μl min−1. The mass spectrometry data were acquired in the positive ion mode, with a selected mass range of 300-2000 m/z. Peptides with +2 to +4 charge states were selected for MS/MS. The three most abundantly charged peptides above a count threshold were selected for MS/MS and dynamically excluded for 30 s with ±30 mDa mass tolerance.
Smart information-dependent acquisition (IDA) was activated with automatic collision energy and automatic MS/MS accumulation. The fragment intensity multiplier was set to 20 and the maximum accumulation time was 2 s. The peak areas of the iTRAQ reporter ions reflect the relative abundance of the proteins in the samples. For peptide identification, the TripleTOF 5600 mass spectrometer used in this study has high mass accuracy (less than 2 ppm). Other identification parameters used included: Fragment mass tolerance: ±0.1 Da; Mass values: monoisotopic; Variable modifications: Gln->pyro-Glu (N-term Q), Oxidation (M), iTRAQ8plex (Y); Peptide mass tolerance: 0.05 Da; Max missed cleavages: 1; Fixed modifications: Carbamidomethyl (C), iTRAQ8plex (N-term), iTRAQ8plex (K); Other parameters: default. 4) Proteomic data analysis: the MS data were processed using Proteome Discoverer software (Version 1.2.0.208) (Thermo Scientific, Marietta, OH) with default parameters to generate the peak list. The data acquisition was performed with Analyst QS 2.0 software (Applied Biosystems/MDS SCIEX). Protein identification and quantification were performed using Mascot 2.3.02 (Matrix Science, London, United Kingdom) (23). For iTRAQ quantification, the peptides for quantification were automatically selected by the algorithm to calculate the reporter peak area, error factor (EF), and p value (default parameters in the Mascot software package). The resulting data set was auto bias-corrected to remove any variation introduced by unequal mixing when combining the differently labeled samples. Genome sequence and annotation information of Synechocystis sp. PCC 6803 were downloaded from NCBI, the Comprehensive Microbial Resource (CMR) of TIGR (http://www.tigr.org/CMR), and CyanoBase (http://genome.microbedb.jp/cyanobase/) (27). The Synechocystis sp. PCC 6803 genome contains 3569 predicted proteins (27).
The ratio between the mutant and the wild type was obtained directly from the protein abundance for any given protein. Proteins with a 1.5-fold or greater change between ethanol-treated and control samples and a p value of statistical evaluation less than 0.05 were determined to be differentially expressed proteins. The quantitation was performed at the peptide level by following the procedures described in http://www.matrixscience.com/help/quant_statistics_help.html. The Student's t test was performed using the Mascot 2.3.02 software. Briefly, a protein ratio is reported in bold face if it is significantly different from unity. The comparison test is: $|\bar{x} - \mu| \le t \cdot s / \sqrt{N}$. If this inequality is true, then there is no significant difference at the stated confidence level. (N is the number of peptide ratios, s is the standard deviation and $\bar{x}$ the mean of the peptide ratios, both numbers calculated in log space. The true value of the ratio, $\mu$, is 0 in log space. t is Student's t for N−1 degrees of freedom and a two-sided confidence level of 95%.) To verify the proteomic results, the identical cell samples used for protein isolation as described above were also used for quantitative real-time reverse transcription-PCR (RT-qPCR) analysis as described before (24,28). Three technical replicates were performed for each gene. The data were then presented as ratios of the amount of normalized transcript in the mutant to that in the wild-type control. The gene IDs and their related primer sequences used for RT-qPCR analysis are listed in supplemental Table S1. All the mass spectrometry proteomics raw data described in this paper have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the data set identifier PXD001266 (http://www.ebi.ac.uk/pride) (29,30). In addition, the MASCOT csv results were also uploaded to MS-Viewer, available through the University of California at San Francisco (a public website, http://prospector2.ucsf.edu/prospector/cgi-bin/msform.cgi?form=msviewer), and can be retrieved using the search key "fvdaozbtdz."
Overexpression and Purification of His6-Sll0794 Protein-The sll0794 coding gene of Synechocystis was amplified using primer pairs that introduce appropriate 5′ or 3′ extensions before its ATG initiation codon and behind its stop codon (supplemental Table S1). The resulting target fragment was treated with T4 DNA Polymerase in the presence of dATP to generate specific vector-complementary overhangs, and then annealed to the pET46 Ek/LIC vector treated with the same enzymes, generating the recombinant plasmid pET46-0794. After verification by DNA sequencing, the pET46-0794 plasmid was transformed into E. coli BL21 (DE3) for protein expression. E. coli BL21 (DE3) harboring pET46-0794 was grown at 37°C in 400 ml LB with 100 μg/ml ampicillin up to an OD600 of about 0.6. The cultures were then induced with isopropyl β-D-1-thiogalactopyranoside (IPTG) at a final concentration of 0.2 mM, followed by incubation at 18°C overnight. For the purification of His6-Sll0794, the cells were harvested by centrifugation at 5000 × g and 4°C for 10 min, washed twice with binding buffer (50 mM Tris-HCl, 500 mM NaCl, 10 mM imidazole, and 10% glycerol (pH 8.0)), and resuspended in 40 ml of the same buffer. The cell suspension was then lysed by sonication on ice. After centrifugation (13,000 × g for 40 min at 4°C), the supernatant was recovered and loaded onto a Ni-NTA agarose column (GE Healthcare, Sweden) equilibrated with binding buffer. After extensive washing with binding buffer and washing buffer (50 mM Tris-HCl, 500 mM NaCl, 50 mM imidazole, and 10% glycerol (pH 8.0)), the His6-Sll0794 protein was eluted from the resin with 8 ml of elution buffer (50 mM Tris-HCl, 500 mM NaCl, 500 mM imidazole, and 10% glycerol (pH 8.0)) and concentrated to ~3 mg/ml by ultrafiltration (Millipore membrane, 10 kDa cut-off size) according to the protocol provided by the manufacturer.
The purity of the eluted His6-Sll0794 protein was then checked by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE).
Electrophoretic Mobility Shift Assays-The EMSAs were performed as described before (31). The promoter regions of 12 predicted target genes of Sll0794 were amplified using the genomic DNA of Synechocystis and the primers listed in supplemental Table S1. Labeling of the probes was performed via PCR using a Cy5-labeled primer (5′-AGCCAGTGGCGATAAG-3′) according to the method described by Tiffert et al. (32). The labeled PCR products were purified with the QIAquick PCR Purification Kit (Qiagen, Germany). For each EMSA reaction, about 10 ng of Cy5-labeled DNA probe was incubated with varying amounts of His6-Sll0794 protein in a buffer containing 1 mg/ml of the nonspecific DNA competitor poly(deoxyinosinic-deoxycytidylic) acid (poly(dI-dC)) (Roche), 20 mM Tris-base (pH 7.9), 1 mM dithiothreitol, 10 mM MgCl2, 0.2 mg/ml of the nonspecific protein competitor bovine serum albumin (BSA), and 5% glycerol for 20 min at 25°C. After incubation, protein-bound and free DNA were separated on nondenaturing 6% polyacrylamide gels with 0.5× TBE running buffer at 10 V cm−1 and 4°C. The gels were scanned using a FujiFilm FLA 9000 Image Scanner (FUJIFILM, Japan). Probe DNA concentrations were determined using a Nanodrop 2000 (Thermo Scientific) at a wavelength of 260 nm.
Promoter Analysis and Motif Identification-The Gibbs Motif Sampler software (http://ccmbweb.ccv.brown.edu/gibbs/gibbs.html) (33,34) and the Motif-based Sequence Analysis Tool MEME (http://meme.sdsc.edu/) (35), both available online, were used to identify matrix models describing DNA sequence motifs present upstream of the genes responsive between the mutant and the wild type. Regions representing ~500 base pairs of the DNA sequences upstream of the translational start site of the responsive genes were extracted from the NCBI genome database using the Regulatory Sequence Analysis Tools (RSAT) (36). Both strands of each sequence were searched, and possible motif locations were identified using the motif matrix score obtained from the Gibbs Motif Sampler software. The multilevel consensus sequence for each motif was then used to generate a sequence logo, which is a graphical representation of a nucleic acid multiple sequence alignment (http://www.weblogo.berkeley.edu/) (37).
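The upstream-region extraction described above can be illustrated with a minimal Python sketch. This is not the RSAT implementation; the function names, the 0-based half-open coordinate convention, and the treatment of the genome as a linear string (no wrap-around for a circular chromosome) are assumptions made here for illustration only.

```python
# Hypothetical helper names; coordinates are 0-based, half-open [start, end)
# on the forward strand. The genome is treated as linear (assumption).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(genome, start, end, strand, length=500):
    """Return up to `length` bases upstream of the translational start,
    read 5' to 3' on the coding strand of the gene."""
    if strand == "+":
        # Start codon begins at `start`; upstream lies to the left.
        return genome[max(0, start - length):start]
    if strand == "-":
        # Start codon ends at `end` on the forward strand; upstream lies
        # to the right and must be reverse-complemented.
        return reverse_complement(genome[end:end + length])
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    toy_genome = "ACGTTGCAAT"
    print(upstream_region(toy_genome, 4, 7, "+", length=3))
    print(upstream_region(toy_genome, 2, 5, "-", length=3))
```

Sequences extracted this way already account for gene orientation, so a motif finder that searches both strands sees each promoter in a consistent frame relative to its start codon.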

Ethanol Tolerance Analysis of the Transcriptional Regulator Mutants-Survey of the Synechocystis genome revealed at least 40 genes encoding putative transcriptional regulators (27). So far only a dozen of them have been functionally characterized, and the results showed that they are involved in the regulation of a wide range of physiological functions, such as nitrogen assimilation (38), inorganic carbon starvation and osmotic stress (39), high-affinity bicarbonate transport (40), biogenesis of photosystem I (41), photosystem stoichiometry in response to high light (42), ferric uptake (43), and heat response (44). However, no transcriptional regulator has been reported to be related to tolerance to solvents or biofuel products. Based on our previous proteomic and transcriptomic analyses of ethanol response (23,24), three of the differentially regulated transcriptional regulator genes, sll0792, sll0794, and sll1423, were selected for construction of knockout mutants and validation of their possible functions in ethanol tolerance. After chromosomal integration and full segregation as confirmed by PCR and sequencing (data not shown), the mutants were grown in parallel with the wild-type Synechocystis in both normal BG11 medium and BG11 medium supplemented with 1.5% (v/v) ethanol. The concentration of 1.5% ethanol was used because this concentration caused ~50% growth inhibition in the wild-type Synechocystis (23,24). Comparative analysis showed that in the normal BG11 medium, no visible difference in growth patterns was observed between the wild-type control and any of the three mutants. However, in the BG11 medium supplemented with 1.5% ethanol, the Δsll0794 mutant showed decreased growth when compared with the wild-type strain, suggesting that the Δsll0794 mutant was more sensitive to ethanol and that the gene may be involved in ethanol tolerance in Synechocystis (Fig. 1). In addition, a similar decrease in growth of the Δsll0794 mutant under lower ethanol concentrations (i.e. 1.0-1.4%) was also observed, although it was less significant (data not shown). For the Δsll0792 and Δsll1423 mutants, no obvious difference was observed between the wild-type strain and the mutants in the BG11 medium supplemented with ethanol at varying concentrations (data not shown), suggesting that these two genes may not be involved directly in ethanol tolerance, and that their differential expression upon ethanol stress in Synechocystis discovered in previous proteomic and transcriptomic analyses could be part of the secondary metabolic responses within the ethanol response cascade (23,24).
Cell morphology of the wild-type Synechocystis and the Δsll0794 mutant under ethanol stress was also compared using microscopic and flow cytometric analysis. In the flow cytometric analysis, the intensity of light scattered at small angles (0.5-2 degrees), FSC, is determined to a substantial extent by the size of the scattering particles (i.e. cells), whereas the intensity of light scattered at large angles (15 to 135 degrees), SSC, is affected by the composition or complexity of the cells (45). Although no visible difference was observed under the microscope between the wild type and the mutant, flow cytometric analysis showed that the internal structural complexity of the Δsll0794 mutant (correlated with SSC values) appeared to decrease, especially at 48 and 72 h of ethanol exposure (Fig. 2), suggesting different physiological responses to ethanol exposure between the wild type and the mutant.
The sll0794 gene was annotated as a mercuric resistance operon regulatory protein and was previously found to be located within a nine-gene cluster involved in sensing and tolerance of Ni2+, Co2+, and Zn2+ in Synechocystis; within the cluster, sll0794 (corR) and slr0797 (corT) were together involved in the Co2+ response system, as their mutants showed decreased Co2+ tolerance (46). Sequence analysis showed that the Sll0794 protein contains a helix-turn-helix (HTH) motif (E-value 5.6e-30) typically found in DNA binding proteins that regulate gene expression. The HTH motif is situated toward the N terminus at amino acid positions 6-77 of Sll0794. To determine whether sll0794 was involved in tolerance to stresses other than Co2+ and ethanol, we also evaluated the cell growth of Δsll0794 and the wild-type control under several other stress conditions, including 0.25% butanol (v/v), 4% NaCl (w/v), pH 6.5, and pH 11.0. Each growth experiment was repeated three times, each with three biological replicates. The results showed that no visible difference between the wild-type Synechocystis and the Δsll0794 mutant was observed under these tested stress conditions (data not shown), suggesting that the tolerance regulated by Sll0794 may be specific to a certain degree rather than part of a general tolerance to all stresses.
Overview of Proteomic Analysis-To determine the possible regulatory network mediated by Sll0794, a quantitative iTRAQ-LC-MS/MS proteomics approach was employed. For proteomic analysis, two independent cultivations of both the wild-type control and the Δsll0794 mutant were grown in BG11 medium supplemented with 1.5% ethanol, and cells were collected by centrifugation (8000 × g for 10 min at 4°C) at exponential phase (i.e. 36 h) (Fig. 1), resulting in two biological replicate samples each for the wild-type control and the Δsll0794 mutant. The purified protein samples were subjected to iTRAQ-LC-MS/MS proteomic analysis as described before (23). After data filtering to eliminate low-scoring spectra, the qualified spectra were matched to 1537 proteins, representing ~43% of the 3569 predicted proteins in the Synechocystis genome (supplemental Table S2). All peptides identified and spectra annotated are provided for each protein in supplemental Table S3. Reproducibility of the quantitative proteomic analyses was assessed by two types of comparisons. First, we labeled and mixed two biological replicates of either the wild-type or the Δsll0794 mutant samples directly for proteomic analysis and plotted the difference versus the percentage of the proteins identified; the results showed that ~80% of the proteins had a difference of less than a delta error of 0.2-0.25, and more than 95% of the proteins had a difference of less than a delta error of 0.5 (Fig. 3A-3B). Second, we labeled and mixed each pair of the Δsll0794 mutant and the wild-type samples for proteomic analysis; the difference between the biological pairs is plotted in Fig. 3C-3D. The dispersion of the iTRAQ ratios of the quantified proteins showed very similar trends between biological replicates, suggesting that the biological noise was reasonably low.
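The first reproducibility comparison amounts to asking what fraction of proteins falls below a given delta error between two replicates. The following Python sketch shows one way such a cumulative fraction could be computed. The text does not define "delta error"; the relative-difference formula used here is an assumption for illustration, as are the function names and the toy abundance values.

```python
def delta_error(a, b):
    """Relative difference between two replicate abundances.

    ASSUMPTION: the paper does not give a formula for 'delta error';
    this definition is used only for illustration.
    """
    return abs(a - b) / ((a + b) / 2)


def fraction_within(deltas, threshold):
    """Fraction of proteins whose delta error is below `threshold`."""
    return sum(d < threshold for d in deltas) / len(deltas)


if __name__ == "__main__":
    # Toy abundances for five proteins in two replicates (made-up values).
    rep1 = [1.00, 0.80, 1.50, 2.00, 0.60]
    rep2 = [1.10, 0.85, 1.20, 2.10, 1.10]
    deltas = [delta_error(a, b) for a, b in zip(rep1, rep2)]
    for threshold in (0.1, 0.25, 0.5):
        print(threshold, fraction_within(deltas, threshold))
```

Evaluating `fraction_within` over a range of thresholds gives the kind of cumulative curve summarized in the text (for example, the share of proteins below 0.25 or 0.5).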
Putative Ethanol-tolerance Network Regulated by Sll0794-Using a cutoff of 1.5-fold change and a statistical significance p value of less than 0.05, we determined that between the wild-type Synechocystis and the Δsll0794 mutant grown under 1.5% ethanol, a total of 54 and 87 unique proteins were down-regulated and up-regulated, respectively. A subset of 12 genes encoding Δsll0794-responsive proteins was selected for RT-qPCR validation. Among these genes, six were found to be up-regulated and six down-regulated by the sll0794 gene disruption according to the proteomic analysis. Although slightly greater changes were found in the RT-qPCR analysis, an obvious positive correlation was still detected between the RT-qPCR and proteomic results for these genes (Table I), suggesting an overall good quality of the proteomic analysis.
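The selection rule used here (at least 1.5-fold change, significant at the 95% level) can be sketched in Python as follows. This is an illustrative reimplementation, not the Mascot code: it applies a one-sample, two-sided Student's t test to peptide ratios in log space against a true log ratio of 0, then applies the fold cutoff to the geometric mean. The function names, the use of log2, and the small table of critical values (limited to 15 degrees of freedom) are choices made for this sketch.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t for 1..15 degrees of freedom.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
             11: 2.201, 12: 2.179, 13: 2.160, 14: 2.145, 15: 2.131}


def differs_from_unity(peptide_ratios):
    """True if the mean log ratio differs significantly from 0 (ratio of 1)."""
    logs = [math.log2(r) for r in peptide_ratios]
    n = len(logs)
    if n < 2:
        return False  # no variance estimate from a single peptide ratio
    if n - 1 not in T_CRIT_95:
        raise ValueError("extend T_CRIT_95 for more than 16 peptide ratios")
    mean = statistics.mean(logs)
    sd = statistics.stdev(logs)
    # No significant difference when |mean - 0| <= t * s / sqrt(N).
    return abs(mean) > T_CRIT_95[n - 1] * sd / math.sqrt(n)


def classify(peptide_ratios, fold_cutoff=1.5):
    """Label a protein 'up', 'down', or 'unchanged' (mutant versus wild type)."""
    if not differs_from_unity(peptide_ratios):
        return "unchanged"
    ratio = 2 ** statistics.mean(math.log2(r) for r in peptide_ratios)
    if ratio >= fold_cutoff:
        return "up"
    if ratio <= 1 / fold_cutoff:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    print(classify([2.0, 2.2, 1.9, 2.1]))
    print(classify([0.5, 0.45, 0.55, 0.5]))
    print(classify([1.0, 1.1, 0.9, 1.05]))
```

Requiring both conditions means that a small but highly reproducible shift (for example, a consistent 1.2-fold change) is not counted as differential, and neither is a large but noisy one.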
Proteins Down-regulated-The Δsll0794 mutant grew poorly under 1.5% ethanol exposure when compared with the wild-type Synechocystis, suggesting that the sll0794 gene disruption may have decreased the expression of proteins important to ethanol tolerance. Among the 54 down-regulated proteins in the Δsll0794 mutant, only 28 were functionally annotated (Table II) (27), whereas the remaining 26 were annotated as hypothetical proteins of unknown function (supplemental Table S4). Compared with the previous proteomic analysis of Synechocystis upon ethanol stress (23), a smaller number of proteins were found to be differentially regulated by ethanol in the Δsll0794 mutant, probably because the transcriptional regulator Sll0794 may have a relatively narrow range of targets with very specific and relevant functions.
FIG. 3. Reproducibility between biological replicates. A, wild type; B, Δsll0794 mutant; C, distribution of iTRAQ log ratios of the detected proteins between Δsll0794 biological replicate 1 and the wild type; D, distribution between Δsll0794 biological replicate 2 and the wild type.
Among the down-regulated proteins, five were previously reported to be related to resistance to environmental stresses: Sll0858, a zinc resistance-associated protein; Ssr2061, a glutaredoxin; Sll0273, a Na+/H+ antiporter; Slr1829, a putative poly(3-hydroxyalkanoate) synthase; and Sll1980, the thiol/disulfide interchange protein TrxA (Table II). An earlier study revealed that the sll0794 gene is located within a cluster involved in sensing and tolerance of Ni2+, Co2+, and Zn2+ in Synechocystis (44), so it is not surprising that Sll0858, a zinc resistance-associated protein, was down-regulated in the mutant. Glutaredoxins are small ubiquitous glutathione-disulfide oxidoreductases that reduce disulfide bonds of target proteins and maintain the redox homeostasis of Synechocystis cells, and disruption of the glutaredoxin coding gene ssr2061 in Synechocystis has been reported to reduce cell viability under oxidative stress (47). The sll0273 gene encoding a putative Na+/H+ antiporter was found to be essential for growth at low Na+/K+ ratios (48), and its mutant showed low tolerance to elevated pH in the BG11 medium (49). In our previous study, we found that the slr1829 gene encoding a putative poly(3-hydroxyalkanoate) synthase was induced by butanol, and that the enhanced production of the carbon storage compound polyhydroxyalkanoates (PHAs) was one important mechanism for Synechocystis to combat butanol stress (50). The down-regulation of the same Slr1829 in the Δsll0794 mutant was consistent with the previous results, suggesting that PHA synthesis could be important for protection against ethanol toxicity in Synechocystis. Cyanobacterial thioredoxins (Trx) constitute a family of redox-active enzymes that catalyze the reduction of a variety of target enzymes, including peroxiredoxins, by disulfide/dithiol exchange. The Synechocystis genome contains several Trx, among which TrxA (slr0623) was found to be essential, whereas others, such as TrxQ, may be important in peroxide tolerance (51).
Although the exact function of Sll1980 (TrxA) in either cell survival or detoxification has not been fully defined, its involvement in ethanol tolerance may be worth further investigation. Interestingly, two proteins involved in urea metabolism, Sll0643, a urease accessory protein G, and Sll1750, a urease alpha subunit, were down-regulated in the Δsll0794 mutant (Table II). A previous proteomic study revealed that a common stress response of Synechocystis under various environmental stress conditions was the activation of atypical pathways for the acquisition of carbon and nitrogen from urea and arginine, as evidenced by the significant up-regulation of urease, which converts urea into CO2 and ammonia, under most conditions (52), and our recent metabolomic analysis of Synechocystis under butanol stress detected an increased concentration of urea (50). The down-regulation of Sll0643 and Sll1750 was consistent with these early results, suggesting that down-regulation of the urea metabolism proteins may be related to the decreased ethanol tolerance in the Δsll0794 mutant.
Slr0646, a probable D-alanyl-D-alanine carboxypeptidase, was down-regulated in the Δsll0794 mutant. The enzyme catalyzes the cross-linking of peptidoglycan chains to form rigid cell walls, and a previous study found that a single Tn5-B21 transposon insertion in the D-alanyl-D-alanine carboxypeptidase gene resulted in altered tolerance to both ionic and osmotic stresses in an Ochrobactrum sp. strain (53). Two proteins related to bacterial two-component signal transduction, Sll1871, a HepK two-component system sensory histidine kinase, and Slr0115, a response regulator, were down-regulated in the Δsll0794 mutant. A previous study found that Slr0115 regulates the accumulation of the monomeric photosystem I and the D1 protein under high light stress conditions (54).
Proteins Up-regulated-Among the 87 up-regulated proteins in the Δsll0794 mutant, 62 were functionally annotated (Table III), whereas the remaining 25 were annotated as functionally unknown hypothetical proteins (supplemental Table S4). Detailed analysis of the up-regulated proteins showed a typical metabolic response similar to that of Synechocystis under various environmental stresses (52): (1) although the Δsll0794 mutant grew poorly under 1.5% ethanol compared with the wild type, significant up-regulation of proteins involved in protein biosynthesis and fate was observed, including 18 ribosomal proteins, a 16S rRNA processing protein RimM homolog (Ssr0808), an initiation factor IF-1 (Ssl3441), and two periplasmic proteases HhoA (Sll1097, Sll1679) involved in house-keeping processes related to protein (re)folding, protein clearance and signaling (52,55); (2) up-regulation of nine proteins involved in energy metabolism, including NADH dehydrogenase subunit 2 (Sll0223), putative thioredoxin-m (Sll0685), iron-sulfur cluster binding protein homolog (Slr2059), probable ferredoxin (Ssl3044) and heme oxygenase (Sll1184), citrate synthase (Sll0401) and succinyl-CoA synthetase alpha chain (Sll1557) of the TCA cycle, photosystem I assembly related protein (Sll0226) and carbon dioxide concentrating mechanism protein CcmK homolog 3 (Slr1838) of photosynthesis and carbon fixation, and two bicarbonate transporters (Slr0040, Slr1512); the up-regulation of protein biosynthesis and energy metabolism is consistent with previous proteomic analysis of Synechocystis under various environmental stresses (51); (3) strong up-regulation in all replicates of the 16.6 kDa small heat shock protein (Hsp17), which is involved in membrane fluidity of Synechocystis under stress conditions (56).
An hsp17 homologous gene was also found to be inducible by ethanol in Leuconostoc oenos (57); (4) up-regulation of 12 regulatory proteins, most of which were found to be responsive to stress conditions, such as two group 2 sigma factors (Sll0184, Sll0306) (SigC, SigB) involved in acclimation to low inorganic carbon at high temperature and in salt acclimation responses (58,59), the two-component sensor histidine kinase Sll0698 responsive to low temperature, and the anti-sigma B factor antagonist Slr1856, whose predicted product resembles the stress-responsive protein kinase SpoIIAA in Bacillus subtilis (60,61). Although their functions are still unknown, it is worth noting that Sll1672, a two-component hybrid sensor and regulator, and Sll1673, a response regulator, were both up-regulated in our experiments, implying their possible functions in the ethanol response in the Δsll0794 mutant.
Modification of the cell envelope represents an important mechanism against stress in various microbes (62)(63)(64). In the Δsll0794 mutant grown under ethanol stress, three proteins involved in the cell envelope, Slr0298, a FraH protein homolog, Sll0016, a probable membrane-bound lytic transglycosylase A, and Sll0721, an S-layer-RTX-related protein, were up-regulated (Table III). Previous studies found that the Fra proteins influenced filament integrity, diazotrophy and localization of the septal protein SepJ in the heterocyst-forming Anabaena (65), that lytic transglycosylases are involved in cell wall turnover in Salmonella enterica (66), and that the S-layer-RTX protein (Sll0721) was induced by ethanol stress in Synechocystis (23).
Identification of the Direct Regulatory Targets of Sll0794 -To identify the direct targets of the Sll0794 transcriptional regulator, we conducted an EMSA analysis using the purified His6-Sll0794 and putative promoter regions of 12 coding genes whose corresponding proteins were differentially regulated (Tables II and III). The sll0794 gene was expressed in E. coli and the His-tagged protein was purified to >90% purity as judged by SDS-PAGE (Fig. 4A). These putative gene targets were selected based on the following considerations: high responsiveness in the mutant, with large fold changes or a response in most of the replicate proteomic measurements; clues that homologous genes were involved in stress responses; and a large intergenic region (>100 bp) before the open reading frame, with a high possibility of an independent promoter. The six genes encoding down-regulated proteins selected for EMSA investigation were sll0273 of a Na+/H+ antiporter, sll0643 of a urease accessory protein G, sll1871 of a HepK two-component system histidine kinase, slr0115 of a response regulator, slr1829 of a putative poly(3-hydroxyalkanoate) synthase, and ssr2061 of a glutaredoxin; and the six genes encoding up-regulated proteins selected for EMSA investigation were sll0685 of a putative thioredoxin, sll1514 of a 16.6 kDa small heat shock protein (Hsp17), slr0040 of a bicarbonate transport system substrate-binding protein, slr0808 of a 16S rRNA processing protein RimM homolog, slr1512 of a putative sodium-dependent bicarbonate transporter, and slr1838 of a carbon dioxide concentrating mechanism protein CcmK homolog 3. In the EMSAs, probes containing the corresponding upstream regions of the selected genes were PCR amplified and labeled for binding evaluation (supplemental Table S1). As shown in Fig. 4B, a clear gel-shift pattern of the purified His6-Sll0794 protein was observed for sll1514, slr1512, and slr1838, whereas no direct binding was observed for the remaining nine genes under the testing condition (i.e.
sll0273, sll0643, sll1871, slr0115, slr1829, ssr2061, sll0685, slr0040, and slr0808); nonbinding between the His6-Sll0794 protein and the putative promoter of ssr2061 is presented as a negative control in Fig. 4B. All three genes showing possible binding with His6-Sll0794 were up-regulated in the Δsll0794 mutant grown under ethanol stress, suggesting that the Sll0794 transcriptional regulator could function negatively on the expression of these target genes. Among the possible targets regulated by Sll0794, the small heat shock protein encoding gene sll1514 was previously found to be involved in the response to many stress conditions, such as osmostress (67), salt (68), acid (69), ethanol (23), and butanol stress (70). It is believed that the small heat shock protein can prevent irreversible protein aggregation and facilitate subsequent protein renaturation in cooperation with ATP-dependent chaperones (71). A biochemical study using both immunoprecipitation and affinity chromatography found that the Sll1514 protein can potentially interact with at least 42 different proteins in vivo specifically during heat treatment, suggesting its diverse cellular functions (70). In addition, early studies found that sll1514 was under direct regulation by a histidine kinase, Hik34, for its roles in salt responses (67), and that it exhibited substantially higher expression in the ΔHik31 mutant than in the wild-type strain under both aerobic and low oxygen conditions (72). However, its regulation by a transcriptional regulator has not been reported before in Synechocystis.
Early studies have found that Synechocystis possesses four inducible inorganic carbon uptake mechanisms (i.e. the ndhF, sbtA, cmp, and ccm systems) (73). The first mechanism involves NADPH dehydrogenase (NDH-1) complexes that are crucial for dark respiration and cyclic electron transport. The gene cluster ndhF3-ndhD3-orf427 was first identified in Synechococcus PCC7002 as being necessary for inducible, high-affinity CO2 uptake (73,74), and was then shown to be essential for inducible, high-affinity CO2 uptake in Synechocystis (73,75,76). The transcription of the ndh operon was regulated by a transcriptional regulator ndhR under low inorganic carbon conditions in Synechocystis (77). The second mechanism involves the sbtA gene (slr1512) encoding a putative sodium-dependent bicarbonate transporter (78). An early study found that slr1512 was essential to Na+-dependent HCO3− transport and may play a crucial role in carbon acquisition when CO2 supply is limited; the Δslr1512 mutant was unable to take up HCO3− and grow in low CO2 at pH 9.0 (78). However, its regulation mechanism is unknown. The third mechanism also involves a HCO3− transport system, the cmp operon, which was shown to be low-CO2 inducible and under regulation by a LysR family transcriptional regulator, CmpR (79). The fourth mechanism involves the CO2-concentrating mechanism (CCM), which is maximally expressed under inorganic carbon limitation (79) and functions to elevate the intracellular concentration of CO2 in the vicinity of Rubisco to compensate for the poor kinetic efficiency of this enzyme (79-82). In the green alga Chlamydomonas reinhardtii, the CCM was regulated by a nuclear regulatory gene, Ccm1, encoding a hydrophilic protein of 699 amino acids with a putative zinc-finger motif in its N-terminal region and a Gln repeat characteristic of transcriptional activators (82).
Interestingly, the two other putative targets of Sll0794 identified by EMSAs, slr1512 (sbtA) and slr1838 encoding a carbon dioxide concentrating mechanism protein CcmK, are key components of the second and the fourth CO2 acquisition mechanisms, respectively (Fig. 4B), suggesting that CO2 acquisition/transport could be related to ethanol tolerance in Synechocystis, and that their expression could be regulated directly by the transcriptional regulator Sll0794.
To further define the possible DNA-binding motifs related to Sll0794, we performed a promoter DNA-binding motif search on 500-bp sequences extracted from the upstream regions of the up-regulated genes, including sll1514, slr1512, and slr1838, using both the Gibbs Motif Sampler and the MEME software (33)(34)(35). The three top conserved motifs, containing 12-15 total sites, were identified (Fig. 5) (37). The results showed that motif 1, "XGGCGATCGCCX," was identified upstream of all three putative target genes, whereas motif 2 was found only upstream of sll1514 and slr1512, and motif 3 was found only upstream of slr1838. Although it can be speculated that these motifs are involved in the regulation of ethanol tolerance in Synechocystis, further experimental verification of their functionality is still needed.
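As a simple illustration of how the presence of motif 1 could be checked in an upstream sequence, the following Python sketch scans a sequence for "XGGCGATCGCCX". It is not the Gibbs Motif Sampler or MEME procedure used in the study; it assumes that X stands for any nucleotide, and the function names and the toy sequence are hypothetical.

```python
import re

# Motif 1 as written in the text; X is assumed to mean "any nucleotide".
MOTIF_1 = "XGGCGATCGCCX"


def motif_to_regex(motif):
    """Translate a motif string into a regex, mapping X to [ACGT].

    A lookahead is used so that overlapping occurrences are also reported.
    """
    return re.compile("(?=(" + motif.upper().replace("X", "[ACGT]") + "))")


def find_motif(upstream, motif=MOTIF_1):
    """Return the 0-based start position of every match in the sequence."""
    pattern = motif_to_regex(motif)
    return [m.start() for m in pattern.finditer(upstream.upper())]


# Hypothetical toy fragment, not a real Synechocystis upstream region.
toy_upstream = "ATTAGGCGATCGCCTTGACA"
hits = find_motif(toy_upstream)
print(hits)
```

The core GGCGATCGCC is its own reverse complement, so for the core a scan of one strand also covers the other; the flanking X positions do not change this under the any-nucleotide assumption.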

CONCLUSIONS
To improve biofuel tolerance, an increasing number of studies have recently targeted regulatory genes rather than metabolic pathway genes in various native or non-native biofuel-producing microorganisms, and such efforts have led to the conclusion that manipulation of regulatory genes could provide a practical alternative for strain improvement (11)(12)(13). Meanwhile, recent metabolic engineering and synthetic biology studies of photosynthetic cyanobacteria demonstrated that this group of bacteria could be an attractive "microbial cell factory" to produce third-generation carbon-neutral biofuels or other bioproducts directly from solar energy and CO2 (17,18,20,21). However, very little is currently known about the regulatory systems involved in biofuel production and tolerance in cyanobacteria. To address the issue, in this study we constructed gene disruption mutants for selected regulatory genes that were found to be responsive to exogenous ethanol treatments in Synechocystis (23,24), and performed comparative tolerance analysis of the mutants and the wild-type strain. These efforts led to the identification of a transcriptional regulator, Sll0794, that was involved in ethanol tolerance in Synechocystis. Using a quantitative iTRAQ-LC-MS/MS proteomics approach coupled with quantitative RT-qPCR and EMSAs, we further determined the possible regulatory network of Sll0794 and identified three gene targets that may be under direct control of Sll0794. As the first regulatory protein discovered so far that is related to ethanol tolerance in Synechocystis, Sll0794 and its gene may serve as a target for transcription machinery engineering to improve ethanol tolerance and productivity.