Analysis of the Rhodobacter capsulatus puf operon. Location of the oxygen-regulated promoter region and the identification of an additional puf-encoded gene.

In an attempt to identify features of an oxygen-regulated promoter, we have determined the location of transcription initiation for the puf operon. The position for the oxygen-regulated promoter was demonstrated by several independent means to be located 699 base pairs (bp) upstream from the pufB structural gene. DNA sequence analysis of the promoter region demonstrates the presence of a 26-base pair region of dyad symmetry followed by a sequence containing homology to promoters which use the RNA polymerase sigma 60 subunit (ntrA) for recognition of DNA. In addition to the oxygen-regulated promoter, a region responsible for low-level constitutive expression of the puf operon was shown to initiate transcription 511 bp upstream from the pufB gene. In contrast to the oxygen-regulated promoter, this second promoter contains no obvious secondary structure nor sequence homology to ntrA-dependent promoters. DNA sequence analysis demonstrates the existence of an additional open reading frame (designated as pufQ) that is located between the promoters and the pufB structural gene. A translational fusion of pufQ to lacZ was used to demonstrate that pufQ is efficiently translated and regulated in a manner analogous to a translational fusion of pufM to lacZ. Finally, we also demonstrate that puf operon transcription initiation and regulation does not involve any puf-encoded gene products.

Purple non-sulfur photosynthetic bacteria are facultative anaerobes that synthesize their photosynthetic apparatus only under conditions of reduced oxygen tension. Rhodobacter capsulatus synthesizes three membrane-bound pigment-protein complexes. The light-harvesting I and II complexes (LH-I and LH-II, respectively) are antennae which absorb light and transmit the energy to a third pigment-protein complex, the reaction center (RC), which serves as the electron donor during photosynthesis. The LH complexes are each composed of two small membrane-spanning polypeptides, α and β, that function as scaffolding to anchor bacteriochlorophyll and carotenoids. The RC is composed of three membrane-spanning polypeptides, L, M, and H, that also function to bind bacteriochlorophyll and carotenoids as well as quinones. Recently, it has been demonstrated that operons which encode these pigment-binding polypeptides, such as the puc operon that encodes the LH-II α and β polypeptides (pucA and B), the puh operon that encodes the RC-H polypeptide (puhA), and the puf operon that encodes the LH-I α and β polypeptides (pufA and B) and the RC-L and M polypeptides (pufL and M), are all transcriptionally repressed by molecular oxygen (1-4). The mechanism(s) whereby oxygen regulates the expression of these and other genes is unknown.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank.
In order to determine what cis-acting regions are involved in oxygen regulation of transcription, it is important to rigorously determine the location of transcription initiation for oxygen-regulated operons. Toward this goal we have undertaken a "functional analysis" of the puf operon promoter region using deletion and insertion (interposon) mapping techniques as well as mung bean nuclease mapping of mRNA transcripts hybridized to DNA probes. The results of this study demonstrate that oxygen-regulated transcription initiation for the puf operon occurs 699 bp upstream from the pufB gene. This oxygen-regulated "photosynthetic promoter" contains a sequence that closely resembles the consensus sequence recognized by the σ60 (ntrA) subunit of RNA polymerase, which is used in a variety of bacteria for genes involved in nitrogen metabolism. In addition, this study provides evidence that a second site for transcription initiation occurs 188 bp downstream from the position of primary transcription initiation and that this second promoter is responsible for the low-level constitutive expression observed during aerobic growth.
Besides evidence for newly defined promoter regions, we also demonstrate the existence of an additional gene encoded by the puf operon (pufQ) that is located between the promoters and the pufB structural gene. A translational fusion of pufQ to the Escherichia coli lacZ gene was used to demonstrate that pufQ is efficiently expressed and regulated in a manner analogous to the pufM gene. Finally, we also demonstrate that puf operon expression and regulation do not involve any gene products encoded by the puf operon.

RESULTS
Deletion Mapping of the Promoter Region-In order to determine the extent of DNA that is required for puf operon expression, a nested set of deletions was constructed in the region of DNA upstream from the pufB gene. To facilitate analysis, the deletions were constructed in a plasmid-encoded copy of the puf operon (pCB552) that also contained a translational fusion of the pufM gene to the E. coli lacZ gene (see Fig. 1 and Ref. 9 for a restriction map). To prevent transcripts that had initiated from outside the R. capsulatus DNA from entering the puf operon, a restriction fragment containing Spcʳ flanked by transcription/translation termination sites (Ω fragment; Refs. 10 and 11) was inserted upstream from each deletion. Following construction, plasmids containing the deletions were introduced into R. capsulatus and assayed for β-galactosidase levels under aerobic and anaerobic (photosynthetic) conditions. The results of this analysis (Fig. 1) demonstrate that the deletions fall into three categories. In one class, plasmids containing 947 and 802 bp of DNA upstream from pufB (pCB552 and pCB552:del1, respectively) exhibited normal expression and regulation. Thus, these constructions appear to contain all the necessary components required for puf operon expression. In contrast, a construction which contains 560 bp of DNA upstream from pufB (pCB552:del2) retained normal activity under aerobic conditions but exhibited no induction of transcription under anaerobic conditions. Finally, a third class of deletions that contained DNA extending 407, 341, and 146 bp upstream from the pufB gene (pCB552:del3, del4, and del5, respectively) exhibited low levels of β-galactosidase activity under both aerobic and anaerobic conditions. These results demonstrate that a region of DNA spanning from 560 to 802 bp upstream from pufB encodes sequence information necessary for anaerobic induction of transcription, whereas a region of DNA spanning from 407 to 560 bp upstream from pufB encodes sequence information responsible for low-level expression observed under aerobic conditions.

The "Experimental Procedures" are presented in miniprint at the end of this paper. Miniprint is easily read with the aid of a standard magnifying glass. Full size photocopies are included in the microfilm edition of the Journal that is available from Waverly Press.

FIG. 2. [...] The uppermost spectrum is that of chromatophores obtained from strain MW442 containing Ω inserted at position 1. In contrast, insertion of Ω at position 2 shows only slight absorbance in this region (middle peak), and insertion of Ω at position 3 results in the absence of absorbance in this region (bottom line). Note that, in order for the Ω 1 LH-I peak to be on scale, the chromatophore preparation from this strain was diluted 2.5× relative to the chromatophore preparations obtained from strains containing Ω 2 and 3.

FIG. 3. Mung bean nuclease mapping of the 5'-mRNA. Autoradiograms A and B show the results of mung bean nuclease mapping analysis for transcription initiation using DNA probes A and B, respectively. Probe A is a 799-bp NcoI-EcoRI restriction fragment from the EcoQ region that is 5'-end-labeled at the NcoI site located 29 bp upstream from pufB. This probe shows two sites of mRNA protection (Ppuf1 and Ppuf2). Probe B, used to resolve the Ppuf1 transcript to a nucleotide level, is a 645-bp FspI-EcoRI restriction fragment 5'-end-labeled at the FspI site. The mRNA-protected DNA was separated on a denaturing 8% polyacrylamide gel next to G+A and C+T chemical cleavage reactions (18) performed on aliquots of the probes.

Interposon Mapping of the Promoter Region-As an independent "functional" assay for the in vivo location of the puf operon promoter, we inserted a transcription/translation terminator (interposon Ω) at various positions within the chromosome of strain MW442. We chose MW442 for this analysis since this strain contains a mutation within the puc operon that results in a loss of LH-II absorbance (23). Thus, this strain allows us to directly measure puf operon expression by observing RC and LH-I absorbance at 805 and 872 nm, respectively (Fig. 2B). As demonstrated in Fig. 2, insertion of Ω at a position 790 bp upstream from pufB (Ω 1) has no effect on puf operon expression. In contrast, insertion of Ω at position 2, located 692 bp upstream from pufB, causes a marked reduction in puf operon expression. Finally, insertion of Ω at position 3, located 407 bp upstream from pufB, results in the complete absence of puf operon expression. These results, which agree with the deletion analysis, demonstrate that a region of DNA between 692 and 790 bp upstream from the pufB gene is required for oxygen-regulated transcription initiation, whereas a region between 407 and 692 bp is responsible for low-level expression of the puf operon.
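The way the two functional assays narrow down each region can be sketched as a simple interval intersection. The snippet below is only an illustration: the function name `intersect` and the tuple representation are assumptions of this sketch, while the coordinates (bp upstream from pufB) are those reported above.

```python
# Coordinates are distances (bp) upstream from pufB, as reported in the text.
# An interval is a (near, far) tuple with near < far.

def intersect(a, b):
    """Return the overlap of two intervals, or None if they do not overlap."""
    near, far = max(a[0], b[0]), min(a[1], b[1])
    return (near, far) if near < far else None

# Region needed for anaerobic induction of transcription.
induction_by_deletion = (560, 802)     # lost between del1 (802) and del2 (560)
induction_by_interposon = (692, 790)   # between insertions 2 (692) and 1 (790)

# Region responsible for low-level aerobic expression.
basal_by_deletion = (407, 560)         # lost between del2 (560) and del3 (407)
basal_by_interposon = (407, 692)       # between insertions 3 (407) and 2 (692)

induction_region = intersect(induction_by_deletion, induction_by_interposon)
basal_region = intersect(basal_by_deletion, basal_by_interposon)

print("oxygen-regulated initiation:", induction_region)  # (692, 790)
print("low-level expression:", basal_region)             # (407, 560)
```

Under these assumptions the combined data place the oxygen-regulated element between 692 and 790 bp and the low-level element between 407 and 560 bp upstream from pufB.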
Mung Bean Nuclease Mapping of mRNA 5' Ends-In order to determine, at a nucleotide level, which base initiates transcription, we performed 5'-mRNA mapping by digesting 32P-labeled DNA probes that were hybridized to cellular mRNA with mung bean nuclease. Two probes were used to locate the start of transcription within the regions of DNA mapped by deletion and interposon analysis. Using an NcoI-EcoRI probe (see "Experimental Procedures"), we were able to observe two start sites for transcription initiation (Fig. 3A). One site was a less intense region of protection that occurs 511 bp upstream from pufB (Ppuf2). This is within the region of DNA shown by deletion analysis to be responsible for low-level constitutive (aerobic) expression. The other, more intense band of mRNA protection observed with this probe (Ppuf1) occurred much farther upstream, in the region of DNA shown by deletion and insertion mapping to be responsible for oxygen-regulated transcription initiation. This primary transcript was resolved at a nucleotide level, to a position 699 bp upstream from pufB, using an FspI-EcoRI probe (Fig. 3B). Fig. 4 shows the position of transcription initiation mapped by this procedure relative to the positions mapped by deletion and insertion mutagenesis.
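As a consistency check, the two mapped 5' ends can be used to predict the behavior of each deletion construct and compared with the three classes observed in Fig. 1. The following is a minimal sketch under a deliberately simple assumption (a start site is functional whenever it is physically retained in the construct); the function name and class labels are hypothetical.

```python
# Mapped 5' ends (bp upstream from pufB), from the mung bean nuclease data.
P_PUF1 = 699  # oxygen-regulated start site
P_PUF2 = 511  # low-level constitutive start site

# Upstream DNA retained by each deletion construct (bp upstream from pufB).
CONSTRUCTS = {
    "pCB552": 947, "del1": 802, "del2": 560,
    "del3": 407, "del4": 341, "del5": 146,
}

def predicted_class(upstream_bp):
    """Predict the expression class from the start sites a construct retains.

    Simplifying assumption of this sketch: a start site is functional
    whenever it is physically present in the construct.
    """
    if upstream_bp >= P_PUF1:
        return "regulated"       # normal expression and regulation
    if upstream_bp >= P_PUF2:
        return "aerobic-only"    # aerobic level retained, no anaerobic induction
    return "low"                 # low activity under both conditions

for name, bp in CONSTRUCTS.items():
    print(f"{name:7s} {bp:4d} bp  {predicted_class(bp)}")

print("spacing between start sites:", P_PUF1 - P_PUF2, "bp")  # 188 bp
```

The predicted classes match the observed ones: pCB552 and del1 are regulated, del2 retains only aerobic-level expression, and del3 through del5 show low activity.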
Although insertion of Ω 1 into the chromosome 790 bp upstream from the pufB gene does not affect puf operon expression, both Ω 1 and Ω 2 disrupt the bacteriochlorophyll A (bchA) gene that overlaps the puf operon promoter region (D. A. Young, C. E. Bauer, and B. L. Marrs, manuscript in preparation). Thus, the spectral analyses shown in Fig. 2 are of derivatives of MW442 that contain their respective Ω insertions as well as a plasmid-encoded trans addition of the bchA gene.

FIG. 4. DNA sequence of the puf operon region located upstream from the pufB gene. The DNA sequence is numbered relative to the start of the pufB gene, which was arbitrarily assigned a value of 0. Deletions and insertions were constructed between the bases indicated. The start sites for transcription initiation, as mapped by mung bean nuclease, are indicated with a wavy arrow. The most probable base for transcription initiation, which was determined as the furthest base protected by mRNA with various concentrations of nuclease, is highlighted by reverse contrast. Direct repeats and inverted repeats located upstream from the Ppuf1 site of transcription initiation are indicated by straight arrows. The location of the pufQ gene and the start site of the pufB gene are indicated by the predicted amino acid sequence. The solid bar below the DNA sequence denotes the probable Shine-Dalgarno ribosome binding site.

Sequence Analysis of the Promoter Region and the Identification of an Additional puf Operon Gene-Analysis of the DNA sequence located between the promoter region and the pufB structural gene (Fig. 4) shows the existence of an open reading frame (termed pufQ) that could potentially encode a protein composed of 74 amino acids. An excellent codon preference plot (from a statistical program for identifying potential genes; Ref. 24) is generated for pufQ when using a codon frequency table obtained from known R. capsulatus genes (Fig. 5). This would be expected for open reading frames efficiently translated by R. capsulatus. Furthermore, a plasmid containing a translational fusion of the E. coli lacZ gene to the 13th amino acid of pufQ (pCB532) demonstrates a level of β-galactosidase activity and regulation by molecular oxygen that is comparable to what is observed with a plasmid containing a pufM:lacZ fusion (Fig. 1). These analyses, in conjunction with the promoter mapping studies described above, support our conclusion that pufQ is the first structural gene of the puf operon.
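The idea of locating an open reading frame in a stretch of sequence can be illustrated with a small scanner. This is a generic sketch, not the program used in this study (Ref. 24 is a codon preference method, which additionally weighs codon usage); the function name `find_orfs` and the toy sequence are hypothetical, and the toy sequence is not the pufQ sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=2):
    """Return (start, end, n_codons) for ATG-initiated ORFs on the given strand.

    start/end are 0-based and end-exclusive, and the span includes the stop
    codon; n_codons counts sense codons only (the stop codon is excluded).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, n_codons))
                start = None                   # look for the next ORF
    return orfs

# Toy sequence (NOT pufQ): two flanking bases, ATG plus three codons, a stop.
toy = "CC" + "ATG" + "GCC" + "AAG" + "CTG" + "TGA" + "GG"
print(find_orfs(toy))   # [(2, 17, 4)]

# Length implied by a 74-codon ORF such as pufQ, counting the stop codon.
print(74 * 3 + 3)       # 225
```

A 74-codon reading frame therefore occupies 225 bp including its stop codon, which fits comfortably between the downstream start site (511 bp upstream from pufB) and pufB itself.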

puf Operon Expression Is Not Dependent on puf Operon Gene Products-The mechanism whereby the puf operon is regulated by molecular oxygen is unknown. Transcription could be directly or indirectly regulated by a puf operon-encoded gene product(s). To test this possibility, we constructed a strain of R. capsulatus (SB1003:de1.,502) containing a deletion of the puf operon by replacing the pufQ through pufM genes with interposon Ω. A reporter plasmid for puf operon expression (pCB532:Km) was then introduced into SB1003:de1.,502. This plasmid contains the puf operon promoter region followed by a translational fusion of pufQ to lacZ. Using this strain, we can assay whether the absence of pufQ through pufM has an effect on the β-galactosidase activity expressed from the plasmid. The results of this analysis (Table I) demonstrate that the absence of puf operon-encoded gene products has no effect on the expression of the pufQ:lacZ fusion. Thus, transcriptional regulation of the puf operon does not appear to directly or indirectly involve any puf-encoded genes.

a Units are expressed as micromoles of o-nitrophenyl-β-D-galactoside hydrolyzed/min/mg protein.
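The units defined in the footnote can be made concrete with a short calculation. The sketch below is an assumption-laden illustration rather than the assay protocol of this study: the extinction coefficient of o-nitrophenol (4.5 mM^-1 cm^-1 at 420 nm) is a commonly quoted, assay-dependent value, and the function name and all readings are hypothetical.

```python
# Assumed millimolar extinction coefficient of o-nitrophenol at 420 nm
# (mM^-1 cm^-1); an assay-dependent value, not taken from the paper.
EPSILON_ONP_MM = 4.5

def specific_activity(a420, volume_ml, minutes, protein_mg, path_cm=1.0):
    """Return micromoles of ONPG hydrolyzed per min per mg protein."""
    if minutes <= 0 or protein_mg <= 0:
        raise ValueError("minutes and protein_mg must be positive")
    conc_mm = a420 / (EPSILON_ONP_MM * path_cm)  # mM equals micromol per mL
    micromoles = conc_mm * volume_ml             # product formed in the assay
    return micromoles / (minutes * protein_mg)

# Hypothetical readings, for illustration only.
print(specific_activity(a420=0.9, volume_ml=1.0, minutes=10, protein_mg=0.02))
```

Expressing activity per milligram of protein is what allows the reporter strains in Table I to be compared directly even when culture densities differ.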

DISCUSSION
Features of the puf Operon Promoters-In this investigation, we have undertaken a study to determine the location of transcription initiation for the puf operon. By employing a functional analysis of the puf operon through deletion and insertion mutagenesis, as well as a physical analysis of the mRNA length by mung bean nuclease mapping, we have mapped the site for oxygen-regulated transcription initiation to a position 699 bp upstream from the pufB gene. Since it had previously been presumed that pufB was the first structural gene of the puf operon (25), the extended distance of the Ppuf1 promoter region from the pufB structural gene was at first surprising. However, sequence information we obtained upstream from the pufB gene demonstrates the existence of an additional puf operon gene (pufQ; see below) which is located between the promoter region and pufB. Therefore, in retrospect, it is not surprising that transcription initiation occurs within the region mapped. It should also be noted that an earlier attempt at S1 nuclease mapping of the puf operon 5'-mRNA suggested that transcription initiation or a site of mRNA processing occurred just upstream (approximately 100 bp) from the pufB gene (25). However, our deletion analysis shows no evidence for transcription initiating from this region (Fig. 1). Since the previous S1 nuclease mapping study does not correlate with the functional length of the puf operon, we presume the discrepancy is due to mRNA processing and/or the use of probes which did not extend into the promoter region. Additional studies on the stability of the puf-encoded mRNA will have to be undertaken to determine if the region approximately 100 bp upstream from the pufB gene is indeed a position of mRNA processing.
The existence of a second low-level constitutive promoter (Ppuf2) located 511 bp upstream from pufB (188 bp downstream from Ppuf1) was demonstrated by both deletion and insertion analysis and further confirmed and mapped to a nucleotide level with mung bean nuclease mRNA mapping. By aligning Ppuf1 and Ppuf2 at the start of transcription (Fig. 6A), we observe a fair degree of homology in the -1 to -8 region (6/8 bp) and in the -35 to -45 region (9/14 bp). However, with the exception of a ntrA consensus sequence present in Ppuf1 (see below), neither of these regions of homology exhibits obvious similarity to the canonical promoter sequence (26) used by E. coli. One area of striking difference between Ppuf1 and Ppuf2 is the numerous regions of secondary structure, such as direct repeats from +1 to -14 and from -72 to -92 and inverted repeats from -25 to -53 and from -72 to -95, that are present in the more highly expressed Ppuf1 promoter but absent in the Ppuf2 promoter (Fig. 4). It should also be noted that the existence of two promoters in highly regulated prokaryotic operons is not uncommon. For example, the seven ribosomal RNA operons from E. coli (rrn) are each thought to contain two tandem promoters separated by a distance of 110-120 bp (27-29). Furthermore, in analogy to what we observed for the puf operon, the upstream rrn promoter(s) (P1) is highly expressed and regulated (by stringent response and growth rate dependence), whereas the downstream promoter(s) (P2) is weakly expressed and unregulated (30, 31). It remains to be seen whether other photosynthetic genes exhibit two promoters.
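Homology figures such as 6/8 bp are simple identity counts over aligned windows. A minimal sketch of that count follows; the function name `identity` is an assumption, and the two 8-base windows are hypothetical strings chosen to differ at two positions, not the actual promoter sequences.

```python
def identity(seq_a, seq_b):
    """Return (matches, length) for two equal-length, pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return matches, len(seq_a)

# Hypothetical 8-base windows (not the real -1 to -8 regions).
window_1 = "GCGTCAAG"
window_2 = "GCCTCAGG"

matches, length = identity(window_1, window_2)
print(f"{matches}/{length} bp identical")  # 6/8 bp identical
```

Note that a count of this kind presupposes a fixed alignment, here anchored at the start of transcription as in Fig. 6A; it does not search for the best alignment.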
In an attempt to determine what features of the Ppuf1 promoter are important for expression and regulation, we scanned other oxygen-regulated genes from R. capsulatus for similar promoter sequences. One such candidate is the puhA (RC-H) gene, which is encoded by a 1400-base transcript that is expressed and regulated by oxygen and light intensity in a manner similar to the puf operon transcript (2). Since the RC-H structural gene encompasses 760 bp, the position of transcription initiation must occur no more than 640 bp upstream from the start of translation. Using a homology search program (32), we scanned a 3100-bp region of DNA upstream from the puhA gene (33) for homology to the Ppuf1 promoter. The result of this analysis demonstrates the existence of a region of DNA located 440 bp upstream from puhA that exhibits a striking degree of homology to the Ppuf1 promoter (Fig. 6B). Interestingly, conserved in both sequences is a region containing extensive homology to the consensus sequence for promoters using the ntrA σ subunit (34-36, 44). The ntrA σ subunit appears to be a constitutively expressed minor subunit of RNA polymerase that is present in a number of diverse Gram-negative bacteria (for a review of ntrA promoters see Ref. 44). One interesting feature common to promoters using the σ60 subunit is that they all appear to be highly induced by an element which binds upstream from the promoter. For example, genes involved in nitrogen metabolism [...] subunit, a potential region for binding such an upstream activator could be the conserved 26-bp region of dyad symmetry located from -24 to -50 bp upstream from the start of transcription (Fig. 6C). Direct evidence for the involvement of one or more of these regions in regulating expression, however, will have to await the future isolation of cis- and trans-acting mutations that affect photosynthetic gene expression.
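A region of dyad symmetry is one that reads the same as its own reverse complement, and this can be scored position by position. The sketch below illustrates the idea on a synthetic 26-base example; it is not the method used in this study, all function names are hypothetical, and the toy sequence is constructed for the example rather than taken from the puf promoter.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]

def dyad_matches(region):
    """Count positions at which a region equals its own reverse complement.

    A perfect dyad of even length scores len(region).
    """
    region = region.upper()
    return sum(a == b for a, b in zip(region, reverse_complement(region)))

def scan_dyads(dna, width, min_matches):
    """Return (start, matches) for every window scoring at least min_matches."""
    dna = dna.upper()
    hits = []
    for start in range(len(dna) - width + 1):
        score = dyad_matches(dna[start:start + width])
        if score >= min_matches:
            hits.append((start, score))
    return hits

# Synthetic perfect 26-base dyad: a 13-base arm followed by its reverse complement.
arm = "ACGTTGCAAGGCT"
toy_dyad = arm + reverse_complement(arm)

print(dyad_matches(toy_dyad))                      # 26
print(scan_dyads("GG" + toy_dyad + "GG", 26, 26))
```

Lowering `min_matches` below the window width lets the scan tolerate imperfect symmetry, which is the usual situation for natural regulatory sites.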
Identification of pufQ-Sequence data demonstrate the existence of an open reading frame between Ppuf1 and pufB. Evidence that this open reading frame encodes an actual protein is provided by its excellent codon usage as well as by the β-galactosidase activity expressed from the pufQ fusion to lacZ, which exhibits a level of activity and regulation analogous to that observed for the pufM fusion to lacZ (Fig. 1). Besides evidence for translation, we have additional evidence that the protein encoded by pufQ is required for bacteriochlorophyll biosynthesis. Thus, the puf operon appears to encode a protein required for bacteriochlorophyll biosynthesis as well as proteins that bind bacteriochlorophyll to form the light-harvesting and reaction center complexes.
Finally, sequence analysis for the Rhodobacter sphaeroides pufB and pufA genes along with several hundred base pairs of DNA upstream from the pufB gene has recently been published (42). Comparison of the DNA sequence of the pufQ gene obtained in this study with the R. sphaeroides sequence demonstrates the existence of the pufQ gene in this related organism (Fig. 7). Thus, the genomic organization of the puf operon in R. capsulatus also appears to be conserved in other purple non-sulfur bacteria. Presumably, additional sequence information and mRNA mapping data for the R. sphaeroides puf operon, as well as additional studies on oxygen-regulated promoters from other operons, should shed more light on the DNA sequences involved in the initiation and regulation of transcription in these organisms.