Molecular cloning of the genes encoding two chaperone proteins of the cyanobacterium Synechocystis sp. PCC 6803.

Molecular chaperones help other proteins in their correct folding and assembly. We have cloned the genes, cpn60 and dnaK, which encode proteins belonging to the chaperonin-60 and the 70-kDa heat shock protein families from the transformable cyanobacterium Synechocystis sp. PCC 6803. These genes are present in single copies in the genome, and the major transcripts for each gene are monocistronic. Comparison of deduced amino acid sequences reveals that cyanobacterial chaperonin-60 is equally homologous to bacterial and plant chaperonin-60 proteins while the product of dnaK is more similar to its bacterial homologues than to its eukaryotic counterparts. The DNA fragments sequenced in these studies also contain five other open reading frames. One of them, ORF60-5, encodes a protein whose deduced amino acid sequence shows remarkable similarity to those of a family of peripheral membrane proteins involved in metabolite transport in bacteria. The transcript levels of dnaK and cpn60 of Synechocystis sp. PCC 6803 increase in response to stress conditions such as heat shock, ultraviolet exposure, and oxidative stress. This is one of the first examples of cyanobacterial gene expression being regulated by environmental stresses.

Molecular chaperones are a family of proteins which assist in the correct folding of other polypeptides and/or their assembly into oligomeric structures but which are not components of the final functional structures (1). Their function is invaluable to several processes in prokaryotes and in various compartments of eukaryotic cells (2). Chaperonins are a distinct group of chaperones which include subunits of a chloroplast protein complex involved in the assembly of ribulose-bisphosphate carboxylase/oxygenase (Rbu-P2 carboxylase) (3), the 60-kDa heat shock protein (hsp60) of yeast (4), and the proteins encoded by the groESL operon in Escherichia coli (3). One type of chaperonin, called chaperonin-60 (cpn60), is homologous to the groEL protein of E. coli (5) and has a molecular weight in the range 56,000-61,000. Chaperonin-60 from plastids of higher plants (Rbu-P2 carboxylase-binding protein) and from bacteria (groEL proteins) share several structural and functional similarities. They are involved in assembly of oligomers into multimeric structures (6,7). Molecular sequencing of genes encoding these chaperonins has revealed a high conservation in the amino acid sequences of chaperonins from E. coli and higher plants. There are, however, some differences between these two chaperonins. The groEL protein of E. coli is active as a functional complex with groES protein (chaperonin-10), while the existence of chaperonin-10 in higher plants has yet to be demonstrated. The plant chaperonin complex contains two distinct cpn60 subunits, α and β. These two types are as divergent from each other as they are individually from the groEL protein of E. coli. Two such divergent types of cpn60 proteins are not found in bacteria.
* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequences reported in this paper have been submitted to the GenBank/EMBL Data Bank with accession numbers J05707 and J05708.
To whom correspondence should be addressed.
The abbreviations used are: Rbu-P2 carboxylase, ribulose-bisphosphate carboxylase/oxygenase; cpn60, chaperonin-60; bp, base pair(s); hsp, heat shock protein; Tes, 2-{[2-hydroxy-1,1-bis(hydroxymethyl)ethyl]amino}ethanesulfonic acid.
Heat shock proteins with molecular masses of approximately 70 kDa (hsp70) have been identified and characterized from several organisms (8). It has been known for a long time that cellular concentrations of these highly conserved proteins increase rapidly under stress conditions. Their role during normal growth conditions has only recently been realized. They function as chaperones in the transport of certain secreted or mitochondrial proteins (9,10). Members of the hsp70 family have been shown to be present in mitochondria (11,12) and chloroplasts (13,14).
Cyanobacteria present a unique system for studying chaperones. Photosynthetic processes in cyanobacteria are functionally and structurally similar to those in chloroplasts of higher plants. Therefore, cyanobacteria and plants may have similar mechanisms for assembling their photosynthetic protein complexes. Examining the role of molecular chaperones in cyanobacteria, which are amenable to techniques of molecular genetics, should help to elucidate the function of these proteins in plastids. Therefore, we have initiated a program to clone and characterize genes encoding important chaperones from cyanobacteria. In this paper, we report characterization of two genes, dnaK and cpn60, encoding homologues of hsp70 and cpn60 proteins, respectively, from the cyanobacterium Synechocystis sp. PCC 6803.

EXPERIMENTAL PROCEDURES
Materials-Cells of Synechocystis sp. PCC 6803 were grown in BG-11 medium buffered with 2 mM Tes-KOH (pH 8.0) under constant aeration at room temperature (approximately 27 °C) unless otherwise mentioned. Radioactive chemicals were purchased from Amersham Corp. The molecular biology reagents and enzymes were obtained from Bethesda Research Laboratories, New England Biolabs, Beverly, MA, or Promega Biotech, Madison, WI. Reagents for oligonucleotide synthesis were from Applied Biosystems, Foster City, CA. The majority of other chemicals and antibiotics were purchased from Sigma.
Genomic Library Screening-Oligonucleotide probes corresponding to the most highly conserved regions of cpn60 and hsp70 proteins were designed according to the codon preferences in Synechocystis sp. PCC 6803 (Table I). The oligonucleotides were synthesized on an Applied Biosystems DNA synthesizer (model 381A) and labeled with 32P using T4 polynucleotide kinase (15). These probes were used to ...
The amounts of RNA obtained from cells that had been treated with different stress conditions did not vary significantly. For Northern blots, 5 µg of RNA was electrophoresed on a denaturing agarose gel containing formaldehyde and then transferred to a nylon membrane filter, which was hybridized with labeled probe. Slot blots were performed using 5 µg of total RNA. Levels of specific transcripts in total RNA isolated from cells grown under different stress conditions were quantitated by scanning autoradiograms of slot blots using an LKB Ultroscan XL enhanced laser densitometer. The quantitative data presented in this paper are averaged from at least two observations.

RESULTS
Isolation and Characterization of Genomic Clones Containing cpn60 of Synechocystis sp. PCC 6803-The amino acid sequences of several cpn60 proteins, including the α subunits of Rbu-P2 carboxylase-binding proteins from wheat (3), the groEL protein of E. coli (3), and the hsp60 of yeast (4), are known from the nucleotide sequences of the respective genes. These proteins are induced by heat shock and share many regions of homology. Oligonucleotides corresponding to three conserved regions of cpn60 proteins were used to screen a genomic library (Table I), and several identical clones hybridizing to them were isolated. Hybridization of these oligonucleotides to the groEL gene in the genome of E. coli was not strong enough to affect the signal-to-noise ratio during screening. Restriction mapping and Southern analysis of these clones were used to identify the region that contained the cpn60 gene. Two overlapping fragments, a StyI fragment (2422 bp) and a SstI-EcoRV fragment (2640 bp), were subcloned in pBluescript SK(+). Both strands of these fragments were sequenced.
... polypeptide that is 551 amino acids in length, has an acidic isoelectric point (4.9), and has a molecular mass of 57,828 Da. It shows remarkable identity in amino acid sequence to various members of the hsp60 family, which are now termed chaperonin-60 proteins (Fig. 2) (2). Accordingly, this open reading frame is named cpn60. Comparison of the deduced amino acid sequence of cpn60 from Synechocystis sp. PCC 6803 to those of other chaperonins revealed that cyanobacterial cpn60 has 56-58% identity to other bacterial homologues (Table II). The similarity of cpn60 of Synechocystis sp. PCC 6803 to eukaryotic cpn60 varies; it more closely resembles Rbu-P2 carboxylase-binding proteins of chloroplasts (56-57% identity) than chaperonins from other eukaryotes (47% identity). In contrast, the groEL protein of E. coli is less homologous to eukaryotic cpn60 proteins (47-51% identity) than to their prokaryotic counterparts (58-59% identity). This difference may reflect the evolutionary relatedness of cyanobacteria to chloroplasts. When the deduced amino acid sequence of cpn60 of Synechocystis sp. PCC 6803 was used to search the GenBank data base, homology was also found to a polypeptide encoded by an incompletely sequenced, unidentified open reading frame near the operon containing genes for the β and ε subunits of ATP synthase of another cyanobacterium, Synechococcus 6301 (21).
The amino acid sequences deduced from the other reading frames shown in Fig. 1 were used to search the GenBank data base. The reading frame ORF60-5a extends beyond the sequenced region and encodes a polypeptide with more than 258 amino acids, while ORF60-3 codes for a polypeptide containing 246 amino acids. The proteins encoded by these two reading frames are not similar to any polypeptide sequence in the data base. The reading frame ORF60-5 encodes a protein homologous to several known peripheral, ATP-binding subunits of membrane-protein complexes involved in translocation of metabolites. For example, the protein encoded by the cysA gene of Anacystis nidulans is required for sulfate transport (22), while the product of pstB of E. coli is involved in transport of phosphate (23). Fig. 3 shows two particularly conserved regions in these proteins. The amino acid residues proposed to be involved in ATP binding are conserved in the ORF60-5 product (Fig. 3). The hydropathy profile of the ORF60-5 protein lacks the long hydrophobic segments typical of integral membrane proteins (data not shown). Thus, the ORF60-5 protein seems to be a member of the family of peripheral proteins involved in metabolite transport. Since the amino acid conservation between ORF60-5 and other proteins (Table III) is limited, it is possible that the protein encoded by ORF60-5 belongs to a transport system not yet characterized at the sequence level.
The number of cpn60 genes in the genome of Synechocystis sp. PCC 6803 was determined by genomic Southern analysis in which the blot was probed with labeled fragments specific for cpn60 genes (Fig. 4). There were major DNA fragments of expected sizes hybridizing with this probe in each lane, demonstrating that there is no other gene in the genome that is highly homologous to the gene cloned in this study. There is, however, one weakly hybridizing fragment in each lane (Fig. 4). Molecular cloning and partial sequencing of this fragment revealed that the weak homology observed in Southern analysis is not significant at the level of deduced amino acid sequence (data not shown).
Table footnotes: Homology between two proteins is expressed as percent identity (proportion of identical residues shared by two protein sequences that have been optimally aligned) and percent similarity (proportion of identical residues and conserved amino acid replacements shared by two protein sequences when optimally aligned). α subunit of Rbu-P2 carboxylase-binding protein.
Table III. The residues that have been proposed to be involved in ATP binding are underlined in the consensus sequence (35).
... An adjacent, overlapping, 2.04-kilobase XbaI fragment was later sequenced.
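The percent identity and percent similarity measures defined above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been optimally aligned (gaps written as "-"), that columns containing a gap are excluded from the denominator, and that "conserved replacements" are defined by the small grouping of residues shown; none of these choices is stated in the text, so the function and the grouping are hypothetical.

```python
# Hypothetical groups of residues treated as conservative replacements.
# The paper does not state which substitution scheme was used.
CONSERVED_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"),
    set("DE"), set("NQ"), set("ST"), set("AG"),
]


def percent_identity_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Columns in which either sequence has a gap ("-") are skipped; this
    gap handling is an assumption, not the paper's documented method.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = identical = similar = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count toward similarity
        elif any(x in group and y in group for group in CONSERVED_GROUPS):
            similar += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


if __name__ == "__main__":
    # Toy alignment, not taken from the cpn60 or dnaK sequences.
    print(percent_identity_similarity("MKIV-DE", "MRLVADD"))
```

In the toy alignment, six columns are compared; three are identical and the other three are conservative replacements under the assumed grouping.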

Molecular Cloning and Characterization of dnaK Gene
The 3,770-bp region sequenced in this study contains three major open reading frames (Fig. 5). ORF70-5 is at least 300 amino acids in length and continues into the upstream region. ORF70-3 is more than 191 residues in length. Comparison of the amino acid sequences deduced from these reading frames with those in GenBank revealed that they are not similar to any protein sequence in the data base. The third reading frame codes for a polypeptide that is 636 amino acids in length, with an acidic isoelectric point of 4.6 and a 67,620-Da molecular mass (Fig. 6). This protein is highly homologous to

[Fig. 6, sequence figure. Surviving legend fragments: "amino acid sequences of a portion of the genomic region ... containing dnaK of Synechocystis ... consensus sequence of -10 region of heat ... [under]lined."]

Table IV shows amino acid sequence identity and similarity values for hsp70 homologues from Synechocystis sp. PCC 6803, E. coli, maize, petunia, and a consensus sequence derived for hsp70 proteins. The products of dnaK of Synechocystis sp. PCC 6803 and E. coli are remarkably conserved (57% identity) and are slightly less homologous to hsp70 proteins of higher plants (46-49% identity).
In Southern analysis, genomic DNA of Synechocystis sp. PCC 6803 was digested to completion with different restriction endonucleases and separated on a 0.7% agarose gel. After transfer to a nylon membrane, the DNA was probed with labeled fragments specific for dnaK genes (Fig. 7). In each lane, DNA fragments of expected sizes hybridized with this probe, demonstrating that there is a single dnaK gene in the genome of Synechocystis sp. PCC 6803.
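The reasoning behind "fragments of expected sizes" in the Southern analyses can be sketched as follows: a complete digest of a known sequence predicts a set of fragments, and only those overlapping the probe should appear as bands. This is an illustrative sketch only. It treats the sequence as linear, uses the well-known EcoRI site (G^AATTC) on a toy sequence, and the function names are hypothetical; the enzymes actually used are not named in the text.

```python
def digest(sequence, site, cut_offset):
    """Return (start, end) coordinates of fragments from a complete digest.

    `site` is the recognition sequence and `cut_offset` is the position of
    the cut within it (1 for EcoRI, which cuts G^AATTC). The sequence is
    treated as linear, which is a simplification.
    """
    seq = sequence.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    boundaries = [0] + cuts + [len(seq)]
    return [(a, b) for a, b in zip(boundaries, boundaries[1:]) if b > a]


def hybridizing_fragment_sizes(fragments, probe_start, probe_end):
    """Sizes of fragments overlapping the probe region [probe_start, probe_end)."""
    return [
        end - start
        for start, end in fragments
        if start < probe_end and end > probe_start
    ]


if __name__ == "__main__":
    toy = "AAAGAATTCCCCCGAATTCTTT"  # toy sequence with two EcoRI sites
    fragments = digest(toy, "GAATTC", 1)
    print(fragments)
    print(hybridizing_fragment_sizes(fragments, 6, 10))
```

A single-copy gene whose probe region lies within one fragment predicts one band per lane; additional bands of unexpected size would point to a second homologous locus.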
Expression of cpn60 and dnaK-In Northern analysis, an RNA species of 2.0 kilobases hybridized with the radioactive probe specific for cpn60 (Fig. 8). Considering the sizes of the coding regions of cpn60 (1659 bp) and the neighboring reading frames (984 and 738 bp), it is unlikely that the major transcript species for cpn60 is polycistronic. The radioactive probe for dnaK mainly hybridized with an RNA species about 2.2 kilobases in length (Fig. 8). The coding region of dnaK is 1911 bp in length. Therefore, the major RNA species for dnaK is monocistronic. Another, longer band was also observed in Northern analysis. The probe hybridized to this RNA species could not be washed away by high stringency washing (80 °C, 40 mM sodium phosphate, 0.5% sodium dodecyl sulfate, 0.05% bovine serum albumin). The presence of two RNA species raises two possibilities. First, there could be two different promoters and/or terminators, giving rise to two RNA species. Second, the RNA could be originally transcribed as a polycistronic message and then rapidly processed. A computer-assisted search for potential factor-independent prokaryotic transcriptional terminator sites (26) in the sequence of the region shown in Fig. 6 revealed a sequence 59 bp downstream from the stop codon of dnaK that has a stretch of thymine residues preceded by a region of dyad symmetry and can act as a strong transcriptional terminator.
However, there was no such site in the 5'-flanking region of the dnaK gene, suggesting that the longer transcripts seen in the Northern blot are derived from the promoter of ORF70-5. The upstream regions of both cpn60 and dnaK genes contain sequences similar to the consensus sequences of heat shock promoters of E. coli (33). A sequence (CCCCATTTA) found 144 bp upstream from the start of the cpn60 region is identical to the consensus -10 region of heat shock promoters of E. coli (CCCCATtTa) (Fig. 2). Similar but less conserved sequences (CCCATCGE and CCCCAGGCA) are also present 50 and 84 bp upstream from the initiation ATG of dnaK of Synechocystis sp. PCC 6803 (Fig. 6). Sequences similar to the consensus for the -35 region of heat shock promoters of E. coli (TNtCNCcCTTGAA) are not observed in the upstream regions of the cpn60 and dnaK genes of Synechocystis sp. PCC 6803. Accordingly, the mechanism of heat shock induction in Synechocystis sp. PCC 6803 may be similar but not identical to that in E. coli.
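The comparison of upstream sequences with the E. coli heat shock -10 consensus can be illustrated with a short Python sketch that slides the consensus along a sequence and counts mismatches at each position. The sketch is not the search procedure used in this study. It weights all nine positions equally (ignoring the distinction between upper- and lower-case positions in CCCCATtTa), and the function names and the toy upstream sequence are hypothetical.

```python
HEAT_SHOCK_MINUS_10 = "CCCCATtTa"  # E. coli consensus as quoted in the text


def count_mismatches(window, consensus):
    """Number of positions at which two equal-length sequences differ."""
    if len(window) != len(consensus):
        raise ValueError("window and consensus must have equal length")
    return sum(1 for a, b in zip(window.upper(), consensus.upper()) if a != b)


def scan_for_consensus(sequence, consensus, max_mismatches=0):
    """Return (position, window, mismatches) for each window within tolerance."""
    seq = sequence.upper()
    width = len(consensus)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        mismatches = count_mismatches(window, consensus)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


if __name__ == "__main__":
    # Toy upstream region, not the actual cpn60 flanking sequence.
    toy_upstream = "GGTACCCCATTTAGGC"
    print(scan_for_consensus(toy_upstream, HEAT_SHOCK_MINUS_10))
    # One of the dnaK upstream elements quoted in the text.
    print(count_mismatches("CCCCAGGCA", HEAT_SHOCK_MINUS_10))
```

Under this equal-weight scoring, the cpn60 element CCCCATTTA matches the consensus exactly, while the dnaK element CCCCAGGCA differs at three of nine positions, consistent with its description as less conserved.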
Figure legend (partial): ... agarose gel containing formaldehyde, transferred to a Genescreen nylon membrane, and hybridized with probes specific to cpn60, dnaK, or psaD. Lanes containing RNA from cultures grown at 20 °C were exposed twice as long to x-ray film as those with RNA from cultures that had been subjected to heat shock.
Effect of Heat Shock and Other Stress Conditions on Transcript Levels of cpn60 and dnaK Genes-Cells of Synechocystis sp. PCC 6803 were grown at 20 °C with constant aeration. They were subjected to heat shock by transferring them to a water bath maintained at a higher temperature. Total RNA isolated from these cells was used for Northern analysis (Fig. 8). Heat shock at 42 or 47 °C resulted in highly elevated levels of RNA for cpn60 as well as dnaK. However, extensive degradation of these transcripts was evident from the smear observed in these blots. When the same filter was reprobed with the labeled psaD gene, which encodes subunit II of photosystem I, a sharp single band was observed in each lane (Fig. 8).
Thus, degradation was characteristic of cpn60 and dnaK transcripts and did not reflect the overall quality of the RNA. Levels of specific RNA species were quantitated by slot blot analysis. Fig. 9 shows levels of cpn60 and dnaK RNAs at different time intervals after transferring cells of Synechocystis sp. PCC 6803 to 42 °C. It took approximately 10 min for the temperature of the culture to rise from 20 to 42 °C, and by that time the level of cpn60 RNA had already increased 8-fold. The maximal level (30 times that of control RNA) was reached after 90 min at 42 °C. The proportion of cpn60 transcripts then decreased gradually during the next couple of hours (Fig. 9). Heat shock also resulted in elevated amounts of dnaK transcripts. The relative increase of dnaK RNA was 25-fold after 90 min at 42 °C. The levels of dnaK RNA then declined rapidly and reached the basal level in the next 90 min. This sharp decrease is consistent with the extensive degradation of dnaK RNA observed in the Northern analysis (Fig. 8). Levels of message for psaD, a representative of nonchaperone genes, were measured during this experiment and were found to decrease immediately following the heat shock (Fig. 9). This RNA species returned to normal levels after 5 h at the elevated temperature. The levels of 23 S RNA in the total RNAs remained unchanged during heat shock.
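The fold increases quoted above are ratios of slot blot densitometer signals from stressed cells to the signal from control cells, each averaged over replicate observations. A minimal Python sketch of that arithmetic follows. The function name and the readings are hypothetical; the readings are arbitrary densitometer units chosen only so that the output reproduces the 8-fold and 30-fold values reported for cpn60, and they are not measurements from this study.

```python
from statistics import mean


def fold_induction(readings_by_minute, control_readings):
    """Return {minute: mean signal / mean control signal}.

    Each value in `readings_by_minute` is a list of replicate densitometer
    readings for one time point; `control_readings` are replicate readings
    for unstressed cells.
    """
    baseline = mean(control_readings)
    if baseline <= 0:
        raise ValueError("control signal must be positive")
    return {
        minute: mean(values) / baseline
        for minute, values in readings_by_minute.items()
    }


if __name__ == "__main__":
    # Hypothetical readings in arbitrary units (two replicates each).
    control = [1.5, 2.5]
    heat_shock = {10: [15.0, 17.0], 90: [58.0, 62.0]}
    print(fold_induction(heat_shock, control))
```

Expressing each time point relative to the control in this way makes transcripts with very different absolute signals, such as cpn60, dnaK, and psaD, directly comparable.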
In addition to elevated temperature, several other stress conditions are also known to induce expression of dnaK and groESL of E. coli (27). We tested four stress conditions, namely ethanol, nalidixic acid, hydrogen peroxide, and ultraviolet radiation, for their ability to affect levels of transcripts for cpn60, dnaK, and psaD in Synechocystis sp. PCC 6803 (Fig. 10). The chemical agents were added to cultures growing at 20 °C and incubated at the same temperature for 30 min. Ultraviolet irradiation was administered to 100 ml of cells in an open glass dish (20-cm diameter) in a UV Stratalinker 1800 (Stratagene, CA). After the treatments, the cells were harvested and RNA was extracted. Ethanol increased levels of RNA for cpn60 and dnaK, as well as psaD, to varying degrees. In contrast, the other reagents tested resulted in elevated levels of RNA for cpn60 and dnaK but not for psaD. Moreover, the levels of psaD transcripts decreased by as much as 50%. On average, these stress conditions had less effect on the levels of cpn60 RNA (maximum increase of 17 times) than on those of dnaK RNA (maximum increase of 25 times). Both cpn60 and dnaK genes were induced less by nalidixic acid than by the other reagents. The proportion of 23 S RNA in the total RNA remained approximately the same in all treatments.
Figure legend (partial): ... stress conditions were quantitated by slot blot analysis of total RNA. The cells were incubated with ethanol, nalidixic acid, or hydrogen peroxide for 30 min or were exposed to 0.5 J of ultraviolet light.

DISCUSSION
Cyanobacteria have photosynthetic processes that are functionally and structurally similar to those in chloroplasts of higher plants. On account of their smaller and simpler genome and their capacity for homologous recombination following natural transformation, these organisms, especially Synechocystis sp. PCC 6803, have been used extensively for mutational analysis of photosynthetic processes (25, 28-31). In light of the phylogenetic relationship (23) and functional resemblance (32) of cyanobacteria to chloroplasts of higher plants, it appears that the mechanisms of assembling their photosynthetic complexes should be similar. Accordingly, we initiated a project of cloning and characterizing genes of Synechocystis sp. PCC 6803 that encode proteins involved in assembly and translocation of other proteins. In this paper, we report the cloning of two molecular chaperones.
The deduced amino acid sequences of the cpn60 and dnaK proteins of Synechocystis sp. PCC 6803 are remarkably conserved when compared with the corresponding proteins from other bacteria or eukaryotes (Tables II and IV). The cpn60 sequences compared in Table II can be classified into various groups. The bacterial cpn60 proteins are more homologous to each other than to eukaryotic proteins. The cyanobacterial cpn60 is an exception, as the percentage of identical residues shared by this protein and the α subunit of Rbu-P2 carboxylase-binding protein from higher plants is as high as when it is compared with the cpn60 proteins of bacteria. This may reflect a closer evolutionary relationship between cyanobacteria and chloroplasts. The dnaK protein of cyanobacteria, however, is slightly more homologous to its counterpart from E. coli.
Thus, the greater similarity between cpn60 of cyanobacteria and plants may be due to their functional resemblance, such as Rbu-P2 carboxylase binding. Therefore, a more appropriate way to group these sequences would be according to their site of function. The mitochondrial cpn60 proteins from yeast, human, and hamster have higher homologies to each other, while the chloroplast-destined Rbu-P2 carboxylase-binding proteins of wheat and castor bean, as well as cpn60 of the photosynthetic cyanobacterium, are more similar to each other.
There are some important differences between homologous chaperones of Synechocystis sp. PCC 6803 and E. coli. First, the organization of genes in the genome is different. The genes from E. coli exist in operons and are expressed as bicistronic RNAs. The groES gene is present 43 bp upstream of groEL, while the dnaJ gene is located 88 bp downstream of dnaK. In contrast, the neighboring regions of cpn60 or dnaK genes in Synechocystis sp. PCC 6803 do not contain groES or dnaJ genes, respectively. Northern blots shown in Fig. 8 strongly suggest that the major RNA species for each gene is monocistronic. Second, the effects of various stress conditions on expression of these genes are not the same in Synechocystis sp. PCC 6803 and in E. coli. For example, hydrogen peroxide does not induce synthesis of groEL protein in E. coli (27) but results in elevated levels of cpn60 transcripts in Synechocystis sp. PCC 6803 (Fig. 10). These differences could be better understood by identifying and characterizing promoters associated with cpn60 and dnaK of Synechocystis sp. PCC 6803.
The results presented in this paper provide one of the first examples of inducing expression of specific cyanobacterial genes by stress conditions. The effect of heat shock on protein synthesis in the cyanobacterium Synechococcus sp. PCC 6301 has been previously demonstrated (34). Upon heat shock, the growth rate of cells decreases and certain proteins are preferentially synthesized. The molecular basis of these events has not yet been unraveled. Figs. 8, 9, and 10 clearly demonstrate induction of cpn60 and dnaK of Synechocystis sp. PCC 6803 as a response to various stress conditions. Similar patterns of induction of these genes and the existence of sequences homologous to the consensus sequence of the heat shock promoters of E. coli in the upstream regions of both of these genes suggest that dnaK and cpn60 are part of a regulon. A rise in temperature results in a large but transient increase in the transcription of these genes (Fig. 9). After 2 h, the amount of RNA specific to these genes decreases to prestress levels, apparently due to massive and specific degradation of dnaK and cpn60 transcripts as well as a rapid decline in their rate of synthesis. The transient nature of the heat shock response correlates with the growth pattern of the bacteria under stress conditions. The growth of Synechocystis sp. PCC 6803 stops immediately following a heat shock and resumes after a lag period of about 6-8 h (data not shown). Similar trends in expression have been observed in other organisms (8,24). It has been suggested that the heat shock response in most organisms is transient at moderate temperatures while sustained at higher temperatures (24). In Synechocystis sp. PCC 6803, the response is transient when cells are shifted to 42 or 47 °C. The promoters of dnaK and cpn60 can be used to transiently express other genes at a very high level. A 200-bp region upstream of the dnaK gene has been found to impart heat shock-inducible expression to a "promoterless" gene encoding chloramphenicol acetyltransferase. Further characterization of the promoters of both dnaK and cpn60 is in progress so that expression of specific genes can be precisely modulated through changes in environmental cues.