Molecular cloning and sequence of the B880 holochrome gene from Rhodospirillum rubrum.

Restriction fragments of genomic Rhodospirillum rubrum DNA were selected according to size by electrophoresis followed by hybridization with [32P]mRNA encoding the two B880 holochrome polypeptides. The fragments were cloned into Escherichia coli C600 with plasmid pBR327 as a vector. The clones were selected by colony hybridization with 32P-holochrome-mRNA and counterselected by hybridization with Rs. rubrum ribosomal RNA, a minor contaminant of the mRNA preparation. Chimeric plasmid pRR22 was shown to contain the B880 genes by hybrid selection of B880 holochrome-mRNA. We report a restriction map of its 2.2-kilobase insert and the sequence of a 430-base pair fragment thereof. Genes alpha and beta are nearly contiguous, indicating that they are transcribed as a single operon. The predicted amino acid sequences coincide with the sequences of the alpha and beta polypeptides established in other laboratories, except for additional C-terminal tails of 10 and 13 amino acid residues, respectively. We suggest that these tail sequences may serve, during membrane assembly, to give these intrinsic membrane proteins their peculiar orientation, with their C terminus facing the periplasm and their N terminus facing the cytoplasm. Intraspecific sequence homology between the alpha and beta genes of Rs. rubrum is low, giving no indication of evolutionary relatedness. This is in contrast to the high interspecific homology between the corresponding sequences of the Rs. rubrum and Rhodopseudomonas capsulata B880 genes.

of membrane digestions with proteinase K, it has been concluded that the N-terminal sequences of both polypeptides protrude from the cytoplasmic side of the intracytoplasmic membranes and that their C-termini are oriented towards the periplasmic side (7).
Our approach to the characterization of the genes encoding the photosynthetic machinery of Rs. rubrum has relied on the purification (8) of B880 holochrome mRNA and its use as a molecular probe. With this probe, we have selected a 5.1-kb¹ HindIII fragment of Rs. rubrum genomic DNA and have cloned it into Escherichia coli C600 with plasmid pBR327 (9) as a vector. This insert contains the two B880 structural genes, the photoreaction center L subunit gene, and part of the M gene. A 2.2-kb fragment was subcloned and its sequence was determined in part. The sequences of the B880 α and β genes are only poorly homologous with one another. However, there is high interspecific sequence homology between the B880 α genes and between the β genes of Rs. rubrum and Rp. capsulata. Their organization, similar to that of Rp. capsulata (10, 11), confirms our suggestion, based on in vitro translational patterns of holochrome-mRNA, that gene β precedes gene α in transcriptional order.² An interesting finding is that, in Rs. rubrum, the predicted sequences of gene products α and β have tail sequences at their C-terminal ends. Since these carboxyl-terminal sequences are not found in Rs. rubrum B880 holochrome polypeptides, their processing probably plays a role in membrane assembly.
Preparation of DNA-Plasmid DNA was amplified by addition of 100 µg/ml of chloramphenicol to logarithmically growing cells of E. coli C600. Extraction of plasmid DNA was followed by centrifugation to equilibrium in ethidium bromide/CsCl gradients (14, 15). Chromosomal DNA was extracted from Rs. rubrum cells (5 g fresh weight) by a modification of the method of Blin and Stafford (16). In this modification, 150 µg/ml of proteinase K was used and the solution was replenished with enzyme after 1.5 h of incubation. After extraction with phenol/chloroform, nucleic acid was recovered by ethanol precipitation. The pellet (14 mg of nucleic acid) was treated with DNase-free pancreatic RNase. The yield of DNA was 6.6 mg.
¹ The abbreviations used are: kb, kilobase; PIPES, 1,4-piperazinediethanesulfonic acid.
² G. Bélanger, J. Bérard, and G. Gingras (1985) Eur. J. Biochem., in press.
Selection of Restriction Fragments by DNA-RNA Hybridization-Samples of purified DNA were subjected to a 24-h digestion with HindIII restriction endonuclease and electrophoresed on horizontal 0.7% agarose gels in 89 mM Tris-HCl, 89 mM boric acid, 2 mM EDTA (pH 8.0). A mixture of EcoRI and HindIII fragments of phage λ was used as molecular weight markers. The DNA was transferred to nitrocellulose and hybridized (17) to 32P-labeled probe. The probe was an Rs. rubrum holochrome-mRNA fraction purified by molecular sieve filtration followed by sucrose density gradient centrifugation.² The RNA probe (2 µg) was 5'-dephosphorylated by treatment with bacterial alkaline phosphatase and labeled with [γ-32P]ATP using phage T4 polynucleotide kinase. Autoradiography was at -70 °C on Kodak XAR 5 film with Dupont Cronex intensifying screens.
Cloning Procedures-Plasmid pBR327 was digested with endonuclease HindIII and treated with bacterial alkaline phosphatase. Genomic DNA fragments of size corresponding to those selected by RNA-DNA hybridization were isolated by electroelution from agarose gels (18, 19). The eluted DNA was dialyzed against 20 mM Tris-HCl (pH 7.3), 200 mM NaCl, 1 mM EDTA, concentrated, and freed of agarose contaminants by filtration on 0.45-µm cellulose acetate filters and by subsequent chromatography on Elutip-D columns. The pooled DNA fragments were ligated to pBR327 with T4 DNA ligase.
Transformation of E. coli C600 was by the CaCl2 procedure of Mandel and Higa (20). Recombinants were selected among the ampicillin-resistant transformants by looking for tetracycline-sensitive colonies on LB plates. Recombinant colonies were transferred onto Colony/Plaque Screen (New England Nuclear) disks and their DNA was denatured (21, 22) ... (23). Hybridizations were carried out in 100 µl of 65% formamide, 10 mM PIPES (pH 6.4), 0.4 M NaCl with 30 µg of 32P-holochrome-mRNA purified according to Bélanger et al.² The hybridization temperature was lowered from 48 to 37 °C with 20 min of incubation at each 1 °C step. After translation of the selected mRNA, the translation products were immunoprecipitated and analyzed by electrophoresis on sodium dodecyl sulfate-polyacrylamide gels (24).
DNA Sequencing-This was according to the Maxam and Gilbert chemical method (25).

RESULTS AND DISCUSSION
The probe that we used, an enriched holochrome-mRNA fraction, probably also contains other RNA species such as fragments of 23 S and 16 S ribosomal RNA (8). Since the bacterial genome contains several copies of the genes coding for ribosomal RNA (26), the use of this probe heavily skewed the procedure towards selection of ribosomal RNA clones. Taking this difficulty into account, we adopted the following cloning strategy: 1) assess the size of those HindIII fragments that hybridize with the holochrome-mRNA fraction, 2) elute the corresponding bands from the agarose gel and transform E. coli C600 with these fragments using plasmid pBR327 as a vector, 3) select the transformants, first by their resistance to ampicillin and their sensitivity to tetracycline and then by colony hybridization with holochrome-mRNA, 4) counterselect these transformants by colony hybridization with ribosomal RNA, 5) authenticate the selected recombinants by hybrid selection of mRNA, and 6) sequence the gene.

Molecular Cloning of the B880 Holochrome Gene-Rs. rubrum genomic DNA, digested to completion with endonuclease HindIII and electrophoresed on agarose gel, was transferred to nitrocellulose and hybridized with the 32P-labeled holochrome-mRNA fraction. The bands with a positive response were of about 2.2, 3.4, 5.1, and 23 kb. Corresponding but slightly wider slices were cut out of preparative gels, electroeluted, and inserted into the HindIII site of plasmid pBR327. Only the 5.1-kb band gave rise to transformants that hybridized with the holochrome-mRNA fraction but not with ribosomal RNA. A restriction map showed the inserts of two corresponding plasmids, pRR51 and pRR51', to have opposite orientations in the plasmid.
Subcloning of a Region Containing the Holochrome Genes-To facilitate the subsequent experiments, we next attempted to localize the holochrome genes on the 5.1-kb fragment by hybridization with purified holochrome-mRNA. Hybridization was carried out on nitrocellulose blots of BamHI and PstI restriction fragments. Positive hybridization was obtained with a 2.2-kb HindIII-PstI subfragment and, more precisely, between BamHI and PstI sites on this subfragment. Excision from pRR51 with endonucleases EcoRI and PstI yielded the 2.2-kb fragment with a %-base pair appendix taken from between the HindIII and EcoRI sites of pBR327. This fragment of mixed origin was inserted between the EcoRI and PstI sites of plasmid pBR327 to form the chimeric plasmid pRR22, which was subcloned in E. coli. Fig. 1 shows the restriction map of the 2.2-kb insert.
Hybrid Selection of mRNA-As a verification that the 2.2-kb insert carried the gene encoding the B880 holochrome, we performed a selection of mRNA (23) using plasmid pRR22 for affinity binding. In this experiment, the plasmid was bound to nitrocellulose and incubated with holochrome-mRNA purified as described elsewhere.² The mRNA was eluted from the solid support and was used to program an E. coli S-30 cell-free translation system into incorporating L-[3H]leucine. The translation products were precipitated either with acetone or with antibodies directed against the holochrome polypeptides. The precipitates were analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and by fluorography as described elsewhere.² As expected, plasmid pBR327 alone does not retain any mRNA: the translation products are the same whether translation is directed by S-30 endogenous mRNA (Fig. 2, lane A) or by purified holochrome-mRNA "selected" by plasmid pBR327 (Fig. 2, lane B). Neither lane A nor lane B contains the α and β polypeptides that can be made under the direction of purified holochrome-mRNA, as illustrated as a control in Fig. 2 ... were precipitated, respectively, with specific antibodies or with acetone (more details in legend to Fig. 2) ... 32P-labeled and used as a hybridization probe on the restriction fragments of Rs. rubrum genomic DNA. As shown in Fig. 3, the restriction fragments obtained with HindIII, BamHI, SalI, SphI, and PvuI gave single hybridization bands ranging in size from 3 to 20 kb. We conclude that the gene is present in a single form.
FIG. 3. Search for different forms of the B880 holochrome gene in Rs. rubrum. Genomic DNA was limit-digested with restriction endonucleases HindIII, BamHI, SalI, SphI, and PvuI. The digests were electrophoresed on 0.7% agarose gel and transferred to nitrocellulose. Hybridization was with a 268-nucleotide fragment cut out of plasmid pRR22 by means of restriction endonucleases BamHI and XhoI and 32P-labeled at the 5' termini.
Sequence Analysis of the B880 Holochrome Gene-The sequence was determined by the chemical method of Maxam and Gilbert (25). The arrows at the bottom of Fig. 1 indicate the sequencing strategy. The sequence of Fig. 4 was determined on both strands over its entire length. It contains two open reading frames, each beginning with an AUG codon and terminating with a UAA stop codon. Putative structural genes β and α are preceded by spacings of 11 and 10 nucleotides, respectively, measured center-to-center between the predicted Shine-Dalgarno (27) ribosome-binding site (mRNA sequence AGGAGG) and the AUG start codon. The termination signal of gene β overlaps with the ribosome-binding site of gene α. As shown in Fig. 5, this sequence is complementary to the 3'-terminal sequence of 16 S rRNA in both Rs. rubrum and E. coli (27, 28). The precedence of gene β, and perhaps also a weaker homology of the gene α ribosome-binding site, may explain the higher translation rates observed in E. coli S-30 cell-free translation systems programmed with Rs. rubrum holochrome-mRNA.² The short distance between genes α and β and our observation (not shown) that a BamHI to PstI DNA fragment forms a stable hybrid with the 0.62-kb holochrome-mRNA coding for both polypeptides α and β indicate that the two genes are transcribed as an operon. Preliminary sequence work shows that the AUG start codon of the gene encoding the L subunit of the photoreaction center is some 112 nucleotides downstream from the UAA termination codon of gene α. This resembles the situation found in Rp. capsulata (10, 11).
Fig. 4 also gives the predicted amino acid sequence of the gene products, using the known sequence of the holochrome polypeptides (4-6) to determine the reading phase. The two sequences coincide for such a long stretch as to exclude an erroneous reading frame assignment. However, there are differences (underlined in Fig. 4): the sequence of putative gene β predicts alanine for the N-terminal residue instead of glutamic acid, as reported (6). This difference may perhaps be related to these authors' 50% recovery of the N-terminal residue and to their finding phenylthiohydantoin derivatives eluting close to alanine and leucine along with the partially liberated N-terminal amino acid. However, the other major difference, consisting of 13- and 10-residue stretches, respectively, at the C-terminal ends of the predicted β and α polypeptides, must receive a completely different sort of explanation (see below).
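The reading-frame analysis described in this section can be illustrated with a minimal sketch. It scans one strand of a DNA sequence for open reading frames that start with ATG (AUG in the mRNA) and end at a stop codon, and then measures the distance from an upstream AGGAGG motif to the start codon. The toy sequence, the function names, and the search window are hypothetical and are not taken from the sequence of Fig. 4. The center-to-center convention used here (geometric centers of the hexamer and of the triplet) is an assumption and need not coincide with the counting convention behind the values of 11 and 10 nucleotides quoted in the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
SD_MOTIF = "AGGAGG"  # Shine-Dalgarno consensus, written in the DNA alphabet


def find_orfs(dna, min_codons=2):
    """Return sorted (start, end) pairs of ATG...stop reading frames.

    Indices are 0-based; `end` is exclusive and includes the stop codon.
    Only the given strand is scanned (three frames).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                j = i + 3
                # Walk codon by codon until a stop codon or the end.
                while j + 3 <= len(dna) and dna[j:j + 3] not in STOP_CODONS:
                    j += 3
                found_stop = j + 3 <= len(dna)
                if found_stop and (j - i) // 3 >= min_codons:
                    orfs.append((i, j + 3))
                    i = j + 3
                    continue
            i += 3
    return sorted(orfs)


def sd_spacing(dna, atg_index, motif=SD_MOTIF, max_upstream=25):
    """Center-to-center distance from the nearest upstream motif to the ATG.

    Returns None when no motif lies within `max_upstream` nucleotides.
    """
    dna = dna.upper()
    window_start = max(0, atg_index - max_upstream)
    pos = dna.rfind(motif, window_start, atg_index)
    if pos == -1:
        return None
    motif_center = pos + (len(motif) - 1) / 2
    atg_center = atg_index + 1
    return atg_center - motif_center


# Hypothetical toy sequence: motif, 5-nt spacer, then ATG GCT GAA TAA.
TOY = "CCAGGAGGTTTTTATGGCTGAATAACC"

if __name__ == "__main__":
    for start, end in find_orfs(TOY):
        print(start, end, TOY[start:end], sd_spacing(TOY, start))
```

A real analysis would also scan the complementary strand and would use the experimentally known N-terminal residues, as done in the text, to choose among candidate frames.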
Codon Usage-Table I shows the codon usage in genes α and β. Comparison with the codon usage in the ribulose-diphosphate carboxylase gene (1401 base pairs) (29) shows some interesting similarities. Some codons are used in neither ribulose-diphosphate carboxylase nor B880: GCA for alanine, ATA for isoleucine, CCT and CCA for proline, ACA for threonine, and GTA for valine. Leu is coded for predominantly by CTG (12/19 in B880 and 22/33 in ribulose-diphosphate carboxylase), neither gene making use of codons CTA and TTA. The G + C content of the coding sequence of B880 is 57.8% compared to 63.7% in the whole genomic DNA (30) and 65.1% in ribulose-diphosphate carboxylase.
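The two quantities discussed in this paragraph, codon counts and G + C content, are simple to compute from a coding sequence. The following is a minimal sketch only; the toy coding sequence and the function names are hypothetical and do not reproduce the data of Table I.

```python
from collections import Counter

# The six leucine codons of the standard genetic code.
LEU_CODONS = ("CTT", "CTC", "CTA", "CTG", "TTA", "TTG")


def codon_usage(cds):
    """Count codons in an in-frame coding sequence (DNA or RNA alphabet)."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


def gc_content(seq):
    """Percentage of G and C residues in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def leucine_fraction(usage, codon):
    """Share of all leucine codons accounted for by one synonymous codon."""
    total = sum(usage[c] for c in LEU_CODONS)
    return usage[codon] / total if total else 0.0


# Hypothetical toy coding sequence: ATG CTG CTG GTT TAA.
TOY_CDS = "ATGCTGCTGGTTTAA"

if __name__ == "__main__":
    usage = codon_usage(TOY_CDS)
    print(dict(usage))
    print(gc_content(TOY_CDS))
    print(leucine_fraction(usage, "CTG"))
```

Because `Counter` returns 0 for absent keys, unused codons can be queried directly without special handling.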
TABLE I (partial). Codon usage in genes α and β: GTT 4, GTC 4, GTA 0, GTG 2; END TAA 2, TAG 0, TGA 0. END, termination codon.
Sequence Homology-To detect possible sequence homology between the genes encoding the α and the β subunits of the B880 holochrome of Rs. rubrum, the corresponding coding stretches were compared using the dot matrix program of Zweig (31). Only weak homology was found, indicating that the two genes are probably not structurally or phylogenetically related (results not shown). Entirely different was the outcome of a comparison of the sequence data of Fig. 4 with the corresponding region of the sequence published by Youvan et al. (11) for Rp. capsulata. A high interspecific homology was found between the coding regions of genes α on the one hand and β on the other (Fig. 6). For this plot, a window of 15 nucleotides was used with a stringency of 12/15, stringency being defined as the number of matches within the window required for printing a dot in the homology search. The homology between genes β is high over most of their length and is strictly in phase, indicating evolutionary relatedness. There is a break in homology between genes β and α, corresponding to a 63-base pair insertion in the β gene of Rs. rubrum with respect to its Rp. capsulata counterpart. This also causes a displacement of the diagonal line. Fig. 6 also shows a high homology between the α genes of the two organisms. Such high homology between the B880 genes of these two species is in line with the general sequence homology found by interspecific cross-hybridization experiments carried out among different photosynthetic bacteria (32). Fig. 7 shows the alignment of the B880 gene sequences of Rs. rubrum and Rp. capsulata according to maximum homology. The predicted amino acid sequences also show high homology: in Fig. 7, regions of perfect amino acid sequence homology are boxed. Some regions of imperfect amino acid homology, but where the substituted residues are expected to fulfill similar functions (Leu/Ile, Asp/Glu, Thr/Ser, Lys/Arg), are indicated by dashed lines.
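The windowed comparison described above (a window of 15 nucleotides and a stringency of 12 matches out of 15) can be sketched as follows. This is a generic dot-matrix routine written for illustration, not the program of Zweig (31); the function name and the toy sequences are hypothetical.

```python
def dot_matrix(seq_a, seq_b, window=15, stringency=12):
    """Return (i, j) positions where two windows share enough matches.

    A dot is recorded at (i, j) when seq_a[i:i+window] and
    seq_b[j:j+window] agree at `stringency` or more positions.
    """
    a, b = seq_a.upper(), seq_b.upper()
    dots = []
    for i in range(len(a) - window + 1):
        window_a = a[i:i + window]
        for j in range(len(b) - window + 1):
            window_b = b[j:j + window]
            matches = sum(x == y for x, y in zip(window_a, window_b))
            if matches >= stringency:
                dots.append((i, j))
    return dots


# Hypothetical 15-nucleotide examples.
REF = "ATGGCTGAAGTTAAA"
THREE_MISMATCHES = "TTGGCTGTAGTTAAC"  # differs from REF at positions 0, 7, 14
FOUR_MISMATCHES = "TTGGCTGTAGTTACC"   # differs from REF at positions 0, 7, 13, 14

if __name__ == "__main__":
    print(dot_matrix(REF, THREE_MISMATCHES))  # 12/15 matches: one dot
    print(dot_matrix(REF, FOUR_MISMATCHES))   # 11/15 matches: no dot
```

In such a plot, homologous stretches appear as runs of dots along a diagonal, and an insertion in one sequence shifts the diagonal, which is the displacement noted in the text. This direct implementation costs on the order of len(a) x len(b) x window comparisons, which is acceptable for sequences of a few hundred nucleotides.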
Speculation on the Role of the C-terminal Extension-Since the 10 to 13 amino acid residue stretches at the C-terminal ends of the predicted α and β sequences were not found in the polypeptides extracted from Rs. rubrum (Fig. 4) and since the B880 structural gene is present in a single form (Fig. 3), we have to suppose that the gene products are modified posttranslationally. These unusual predicted C-terminal sequences may perhaps be related to the rather unique orientation of the B880 holochrome polypeptides, with their N termini protruding into the cytoplasmic compartment and their C termini probably facing the periplasm (7). Other intrinsic membrane proteins in the prokaryotes or in the eukaryotes are all found to be oriented with their C termini facing the cytoplasm and their N termini facing the periplasm (33-42). Band III protein of the erythrocyte membrane has both its N and C termini facing the cytoplasm (43). As pointed out by Brunisholz et al. (6, 7), polypeptides α and β contain three regions: 1) a polar/charged N-terminal region of 12 (for α) and 20 (for β) residues, 2) a hydrophobic region of 21 (for α) and 23 (for β) residues with the proper length to span once the 32-Å hydrophobic layer of the membrane, and 3) a C-terminal polar region. This last region would be even more polar and longer in the gene products predicted here and may thus be involved in membrane assembly. This suggestion is all the more tempting in the absence of a signal peptide (44, 45), whose function of solubilization and anchorage in the hydrophobic phase of the membrane is probably filled by the hydrophobic stretch of 21 to 23 residues. Two main scenarios of membrane assembly may be envisaged according to whether it is assumed to be post- or cotranslational. In the first case, the trigger signal mechanism (37, 38) might be invoked. Accordingly, the C-terminal sequence, together with the polar N-terminal region, might ensure the solubility of the proteins in the cytoplasm. Processing of the predicted extension by a cytoplasmic peptidase might cause a solubility and/or conformational change that would trigger membrane integration of the proteins. In this respect, there is evidence for a conformational change in the B880 holochrome proteins of Rp. capsulata after contact with the membrane (46).
In a cotranslational assembly mechanism, the polar/charged N terminus would stay at the membrane-to-cytoplasm interface while the hydrophobic stretch would penetrate the hydrophobic interior. The C-terminal polar tail would be pulled through the membrane by the free energy decrease afforded by the interaction of the hydrophobic domains of the proteins and of the membrane. A further thermodynamic pull might be afforded by the proteolysis of the C-terminal extension, perhaps in the periplasmic compartment. A cotranslational mechanism might facilitate the concerted assembly of the α and β polypeptides with each other and with the L and M subunits of the photoreaction center, whose genes are in sequential order with the B880 genes (10, 11). The H subunit may also play a pivotal role in this assembly (47, 48).
The only other known instance of carboxyl-terminal processing of a membrane polypeptide is that of the 32-kDa @protein of chloroplast. In that case, membrane assembly appears to be posttranslational and is thought to occur via the membrane trigger mechanism (49).