The Primary Structure of a Halorhodopsin from Natronobacterium pharaonis STRUCTURAL, FUNCTIONAL AND EVOLUTIONARY IMPLICATIONS FOR BACTERIAL RHODOPSINS AND HALORHODOPSINS*


Conserved and conservatively replaced amino acid residues in all three proteins identify general features essential for ion-motive bacterial rhodopsins, responsible for overall structure and chromophore properties. Comparison of the bacteriorhodopsin sequence with those of the two halorhodopsins, on the other hand, identifies features involved in their specific (proton and chloride ion) transport functions.
The bacterial rhodopsins comprise a family of small retinal-containing proteins in the extremely halophilic bacteria, providing light-dependent ion transport and sensory functions for these organisms. Since the discovery of bacteriorhodopsin and halorhodopsin, the list of bacterial rhodopsins has grown considerably. In addition to bacteriorhodopsin (reviewed by Stoeckenius and Bogomolni, 1982; Lanyi, 1984), which is a light-driven proton pump in Halobacterium halobium, another pigment with the same transport function, named archaerhodopsin (Mukohata et al., 1988), was recently found in a halobacterium from Australia. A second halorhodopsin, a light-driven chloride pump similar to halorhodopsin in H. halobium (Lanyi, 1986), has also been identified in Natronobacterium pharaonis (Bivin and Stoeckenius, 1986). A sensory rhodopsin, serving both photoattractant (in the red) and photophobic (in the near UV) responses, was found in H. halobium (Spudich and Bogomolni, 1988), while another sensory rhodopsin, for photophobic response (in the blue), is present not only in H. halobium (Spudich and Bogomolni, 1988) but probably also in N. pharaonis (Bivin and Stoeckenius, 1986).
The structural genes for the polypeptides of bacteriorhodopsin (Dunn et al., 1981) and halorhodopsin (Blanck and Oesterhelt, 1987) have been cloned and sequenced, and study of the structural and functional consequences of mutational alterations in bacteriorhodopsin has already begun. In this context, comparisons of the amino acid sequences of naturally occurring bacterial rhodopsins are interesting and may be expected to reveal: (a) common residues, needed for enclosing the retinal in an environment which provides a large color shift (Spudich et al., 1986; Lanyi et al., 1988) and allows its specific reversible all-trans to 13-cis isomerization upon illumination, and (b) distinct residues, needed for the light-activated ion translocations and sensory functions. Comparison of the flanking DNA sequences is also important, since the way transcription is initiated and terminated in the archaebacteria is only beginning to be understood (DasSarma et al., 1984; Zillig et al., 1988; Thomm and With, 1988; Suillard et al., 1988; Shimmin and Dennis, 1989).
N. pharaonis, like other natronobacteria and natronococci, was isolated from soda lakes in North Africa, whose pH is 10.5-11 (Tindall et al., 1980; Soliman and Trüper, 1982). These organisms are clearly extreme halophiles: they contain the typical archaebacterial lipids (Tindall et al., 1984) and grow optimally at 4 M NaCl like halobacteria, but at pH 9.5 rather than 7. They are only distantly related to the halobacteria, as indicated by a significant difference in the GC content of their DNAs, and low homology in DNA/DNA, as well as DNA/16 S RNA cross-hybridizations (Tindall et al., 1984). The finding of retinal pigments in N. pharaonis is therefore an important extension of the otherwise restricted list of organisms which harbor these proteins. The halorhodopsin from N. pharaonis, which we propose here to name pharaonis halorhodopsin,¹ was first detected by flash spectroscopy in membrane preparations from this organism (Bivin and Stoeckenius, 1986). In these studies the flash-induced difference spectra demonstrated the presence of two retinal-dependent photoreactive pigments, one with an absorption maximum near 580 nm and a cycling half-time of 2 ms, and a second one with an absorption maximum near 500 nm and a cycling half-time of 0.5 s. The spectra of the transient photoproducts for the former generally agreed with those of halorhodopsin, and the kinetics of their interconversions were affected by chloride. In our experience also (Duschl et al., 1990), the purified pigment shows photoreactions similar, although not identical, to those of halorhodopsin from H. halobium. Additionally, N. pharaonis cells exhibited passive light-dependent proton influx (Bivin and Stoeckenius, 1986), similar to that which in H. halobium originates with halorhodopsin.
We have confirmed these results on transport with envelope vesicles prepared from N. pharaonis (Duschl et al., 1990). The second, slowly cycling pigment detected in N. pharaonis, on the other hand, resembles the blue-light absorbing sensory rhodopsin in H. halobium.
We have purified the pharaonis halorhodopsin and describe its spectroscopic and photochemical properties elsewhere (Duschl et al., 1990). As part of our efforts to describe this system, and in order to make comparisons with the other known bacterial rhodopsins, we report here on the cloning and sequencing of the structural gene for this protein. The open reading frame was confirmed to code for the pharaonis halorhodopsin protein by peptide sequencing and immunoreactions.

MATERIALS AND METHODS
Cloning and Sequencing-N. pharaonis (strains SP-1W and SP-1, generous gifts from W. Stoeckenius (University of California, San Francisco) and W. D. Grant (University of Leicester, United Kingdom), respectively) was grown at 37 °C with aeration, in a medium containing the following per liter: 200 g of NaCl, 50 g of Na2CO3·10H2O, 10 g of yeast extract (Difco), 7.5 g of casamino acids (Sigma), 3 g of Na3citrate·2H2O, 2 g of KCl, 1 g of MgSO4·7H2O, 50 mg of FeSO4·7H2O, 0.36 mg of MnCl2·4H2O. Genomic DNA was prepared according to Vogelsang et al. (1983). The DNA was restricted with a number of endonucleases and probed on Southern blots with a DNA fragment bearing the hop gene from H. halobium (Blanck and Oesterhelt, 1987), isolated from a pGEM plasmid construct with EcoRI and BamHI digestion, and radiolabeled with the random primer method (Feinberg and Vogelstein, 1983). The stringency employed was medium level: hybridization was at 42 °C in 30% formamide, and the wash at 60 °C with 2 × SSC (SSC is 0.15 M NaCl, 0.015 M Na3citrate, pH 7.0). A 2.4-kilobase PstI fragment which hybridized with the probe DNA, and therefore carried the putative pharaonis hop gene, was cloned first into the vector Bluescribe, in Escherichia coli JM105, and then into the single-strand vector M13mp19. A 1.7-kilobase fragment left after SalI digestion of the insert was additionally subcloned into M13mp19 (clone Flb); single-strand annealing showed that one of the original M13 clones (M2) contained its complementary strand. Flb contained all of the pharaonis hop gene, beginning about 50 bp³ from the universal primer sequence. Sequencing was with the Sequenase system (United States Biochemical Corp.), using first the universal primer for single strands from Flb, and subsequently primers prepared according to sequence information for single strands from Flb and M2. Both strands were sequenced, and in overlapping segments.
¹ We suggest that as new bacterial rhodopsins are discovered, the original names which refer to their function be retained, i.e. bacteriorhodopsin for proton pumps, halorhodopsin for chloride pumps, and sensory rhodopsin for sensory pigments. These names should be either preceded by the species name when appropriate, or followed by roman numerals when the species are not identified, or more than one pigment with the same function is found in the same species. We suggest also that the genes for these proteins be called bop, hop, and sop, respectively, together with the species designations to indicate their origins.
Purification of Pharaonis Halorhodopsin-We found that some of the published purification procedures for halobium halorhodopsin resulted in bleaching of pharaonis halorhodopsin, while others did not yield a pure pigment. Pharaonis halorhodopsin was therefore purified by a modified procedure recently developed by us, and described elsewhere (Duschl et al., 1990). Briefly, it consists of Tween 20 extraction of N. pharaonis membranes, solubilization with cholate in 4 M NaCl, followed by chromatography on phenyl-Sepharose where elution was with Lubrol PX, and repeated FPLC chromatography on hydroxylapatite where elution was with a phosphate-containing Lubrol PX buffer. The preparation was highly pure: the 280/570 nm absorption ratio was 1.48, which is equivalent to our purest halorhodopsin preparations (Duschl et al., 1988). On SDS-acrylamide gels it ran as a single monomeric band, slightly above the position of halorhodopsin, but contained an additional band which we attribute to oligomeric pharaonis halorhodopsin (not shown, cf. also Vogelsang-Wenke et al., 1986).
Peptide Sequencing-For CNBr cleavage, 245 µg of pharaonis halorhodopsin was precipitated with cold acetone, washed with distilled water, and incubated for 17 h at room temperature in the dark with 1.6 ml of 10% CNBr in 70% formic acid. The reaction was stopped by adding 20 ml of water and the solution was concentrated in a rotary evaporator, with repeated additions of 70% formic acid, to about 0.5 ml. To this residue 1.5 ml of water was added, and the mixture injected on an HPLC system (Pharmacia LKB Biotechnology Inc.) equipped with an Aquapore RP-300 7-µm column (30 × 2.1 mm, Applied Biosystems). Peptides were eluted by a 0-80% acetonitrile gradient, containing 0.1% trifluoroacetic acid. Amino acid sequencing of these was performed on an Applied Biosystems 470A gas phase sequencer. The phenylthiohydantoin derivatives were separated on an isocratic HPLC system as described (Lottspeich, 1985; Eckerskorn et al., 1988).
Immunological Studies-Four peptides, labeled A-D, derived from the halobium hop and putative pharaonis hop sequences as described below, were made on an Applied Biosystems model 340A peptide synthesizer (May et al., 1988). For coupling to keyhole limpet hemocyanin, peptide B contained an extra cysteine residue at its NH2 terminus. Coupling was via the sulfhydryl group, with m-maleimidobenzoic acid-N-hydroxysuccinimide ester (Kitagawa and Aikawa, 1976; Liu et al., 1979). Antisera were raised by injecting rabbits with either 120 µg of keyhole limpet hemocyanin-coupled peptide B, or 80 µg of pharaonis halorhodopsin, and repeating the injection 4 weeks later. Blood was collected 10 days after the second injection. For the ELISA assays 200 µl of antigen solutions (100 µg/ml free synthetic peptide in phosphate-buffered saline (0.14 M NaCl, 2.7 mM KCl, 1.5 mM KH2PO4, 8.1 mM Na2HPO4, pH 7.2)) were treated in Costar microtiter plates according to a procedure described before (May et al., 1988), and reacted with anti-pharaonis halorhodopsin serum.
Immunoblots were produced after electrophoresing 5-20 µg of bacteriorhodopsin, halorhodopsin, and pharaonis halorhodopsin on denaturing SDS-acrylamide gels, and transfer to nitrocellulose (Schleicher & Schuell) for 30 min at 150 mA in 25 mM Tris, 192 mM glycine, pH 8.3, 20% methanol, 0.04% SDS. The filter was washed with phosphate-buffered saline containing 0.05% Tween 20 and incubated overnight with anti-peptide B antibody (1:500 in the same buffer). Bound antibody was detected with ¹²⁵I-labeled sheep anti-rabbit Ig antibody (Amersham International, Great Britain; 10⁷ cpm/ml).

RESULTS
The nucleotide and amino acid sequences of the pharaonis hop gene and its flanking regions are shown in Fig. 1. The location of the start codon was assumed to be the ATG which precedes the first putative transmembrane helical segment (cf. below), as in halobium hop. This is reasonable, because pharaonis halorhodopsin has a molecular weight similar to that of halobium halorhodopsin.
However, since Edman degradation of the intact protein detected no NH2 terminus, the start of the translated sequence as well as the NH2 terminus of the processed pharaonis halorhodopsin are not known at this time. Peptide sequences corresponding to the three underlined (solid line) segments were obtained from CNBr cleavage fragments prepared from purified pharaonis halorhodopsin.
These confirmed that the open reading frame in Fig. 1, cloned with halobium hop as probe, codes for the pharaonis halorhodopsin polypeptide.
³ The abbreviations used are: bp, base pair(s); FPLC, fast protein liquid chromatography; SDS, sodium dodecyl sulfate; HPLC, high pressure liquid chromatography; ELISA, enzyme-linked immunosorbent assay.
The first sequence, MPAGHFAEGSSVMLGG-EEVDG, was derived from a partial cleavage product (i.e. it contained a methionine residue), and was a particularly valuable sequence for identification since it corresponds to a poorly conserved, interhelical region of bacterial rhodopsins (cf. below). Another method for identifying the cloned sequence as the structural gene for the pharaonis halorhodopsin polypeptide was by immunoreactions. Antibodies were raised against pharaonis halorhodopsin, and these were tested, in ELISA assays at different dilutions, against two synthetic peptides constructed from the putative carboxyl-terminal sequence of the supposed pharaonis hop gene in Fig. 1 (peptide A: YLTSNESVVSG; and peptide B: GSILDVPS), and two peptides from sequences in the halobium hop gene (peptide C: AGQTLGTMSS, from the carboxyl-terminal region; and peptide D: AVRENALLS, from the amino-terminal region). As shown in Fig. 2, the peptides from halobium hop (peptides C and D) reacted poorly or not at all with the pharaonis halorhodopsin antibodies, but the peptides from pharaonis hop, particularly peptide B, which corresponds to a segment nearer to the carboxyl terminus than A (Fig. 1), reacted strongly, as expected if these sequences are contained in pharaonis halorhodopsin.
Conversely, antibodies raised against peptide B were tested against blots of purified pharaonis halorhodopsin, halorhodopsin, and bacteriorhodopsin (not shown). Reaction was seen only with a major band containing monomeric pharaonis halorhodopsin and a minor band containing an oligomeric form of this protein. Fig. 3 shows flanking sequences from halobium bop and hop, and pharaonis hop, aligned with respect to starting and termination codons. Comparison of these sequences, with respect to features which might play initiation and termination roles, will be given below.
The derived amino acid sequence for pharaonis halorhodopsin could be aligned readily by eye with the bacteriorhodopsin and halorhodopsin sequences from H. halobium (Fig. 4; identical residues are indicated with dots above the sequences). In the alignment no gaps were allowed in sequences corresponding to the putative intramembrane helical regions in halobium bop and hop; i.e. the helical assignment was assumed to be valid for pharaonis hop as well. With this method of alignment a total of 57 residues were identical, and 38 conservatively replaced, among the three sequences. Computer alignment (e.g. with the Dayhoff algorithm) produced a slightly better score (61 identities and 34 conservative replacements), but with many short gaps throughout the entire sequence, including the putative helices (not shown). We prefer the alignment in Fig. 4, where, in contrast, all the gaps are located in interhelical regions, and with the exception of long single gaps for halobium bop and hop in the B-C interhelical region, they are few and short. Other than trivial alternatives for the exact placing of some of the gaps, the alignment is unambiguous. The large number of identical residues in all three proteins, and the numerous identities of residues in any two of the three proteins, clearly indicate that these are closely related sequences.
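The kind of count quoted above (identities and conservative replacements among three aligned sequences) can be illustrated with a short sketch. It classifies each alignment column as identical, conservatively replaced, or neither, and does not score columns that contain a gap. The conservative-replacement groups and the toy sequences are assumptions made for illustration only; the grouping actually used for Fig. 4 is not specified in the text.

```python
# Hypothetical conservative-replacement groups (an assumption, not taken from the paper).
CONSERVATIVE_GROUPS = [
    set("ILVM"),  # aliphatic
    set("FWY"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("ST"),    # hydroxyl
    set("NQ"),    # amide
    set("AG"),    # small
]


def classify_column(residues):
    """Classify one alignment column as 'identical', 'conservative', or 'other'."""
    if "-" in residues:
        return "other"  # columns containing a gap are not scored
    if len(set(residues)) == 1:
        return "identical"
    for group in CONSERVATIVE_GROUPS:
        if all(r in group for r in residues):
            return "conservative"
    return "other"


def score_alignment(aligned_seqs):
    """Count column classes over equal-length aligned sequences."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    counts = {"identical": 0, "conservative": 0, "other": 0}
    for column in zip(*aligned_seqs):
        counts[classify_column(column)] += 1
    return counts


# Toy three-way alignment (invented; not the sequences of Fig. 4).
toy = ["MLD-RW", "MIE-KW", "MVDAKF"]
print(score_alignment(toy))
```

Changing the groups changes the conservative count but not the identity count, which is one reason the identities are the more robust of the two figures.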

Transcription Initiation Signals-Of the gene flanking sequences shown in Fig. 3, the initiation of transcription (shown with asterisks over them) was determined for bop (DasSarma et al., 1984) and hop⁴ in H. halobium. In the absence of such information for pharaonis hop, we assume that, as usual in archaebacteria, initiation in this case is also near the ATG codon (underlined) which initiates translation. Comparison of the three sequences in Fig. 3 Dennis (1985), and DasSarma et al. (1984), GAGTTA, beginning at -34 in halobium bop, is a possible consensus sequence for these genes. In halobium hop a similar sequence, GAGGTT, begins at position -36 (Blanck and Oesterhelt, 1987); in pharaonis hop the corresponding sequence is GATTTC at -34 (these sequences in Fig. 3 (Shimmin and Dennis, 1989). The sequences in Fig. 3 Fig. 3 are shown in bold).
Transcription Termination Signals-DasSarma et al. (1984) found that the 3' terminus of the halobium bop transcript was about 45 bp downstream of the translation stop codon, at the end of an RNA sequence capable of forming a short stem-loop structure, such as required for rho-independent termination in E. coli (the G + C-rich regions of dyad symmetry are shown with arrows in Fig. 3). This sequence was not followed, however, as usually found in such sequences, by a run of uridine residues which would allow the release of the RNA polymerase.
Instead, it contained the heptamer DNA sequence block, CAACGGAC (shown, together with similar sequences under- and overlined in Fig. 3), with 4/7 matches with the E. coli rho-dependent termination consensus sequence, CAATCAA.
Significantly, this same "mixed" termination motif seems to be present in the other two genes in Fig. 3: dyad symmetries and possible consensus sequences (4/7 matches with the E. coli sequences) are shown as for halobium bop. The stem-loop structure shown for halobium hop in Fig. 3 is somewhat different from that suggested earlier for this gene (Blanck and Oesterhelt, 1987).
The transcript termination of some operons in E. coli is the result of a rho-independent termination site, followed after 50-100 base pairs by a rho-dependent termination site (Sameshima et al., 1989) arisen first, followed by the evolution of an upstream stem-loop structure in the RNA transcript, to protect it from 3'-exonucleolytic degradation. Eventually, the E. coli RNA polymerase would recognize the stem-loop structure as a factor-independent transcriptional pause site, and the ensuing string of uridine residues then evolved to facilitate the release of the RNA polymerase, and the transcript, from the DNA template. If this is so, the archaebacterial genes in Fig. 3 might represent a primitive stage of evolution, which E. coli has already passed: they employ a consensus sequence for factor-dependent termination, and a stem-loop structure to stabilize the RNA transcript, but the absence of uridine residues suggests that the RNA polymerase does not yet use the stem-loop structure as a termination signal. However, genes in nitrogen-fixing archaebacteria, for example, contain both stem-loop structures and runs of uridine residues (Suillard et al., 1988).
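The 4/7 figure quoted for the halobium bop block can be checked by sliding the E. coli consensus along the block and counting matching positions. The sketch below uses only the two sequences quoted in the text; the function names are illustrative and not part of any published tool.

```python
def count_matches(a, b):
    """Number of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b))


def best_window(sequence, consensus):
    """Slide the consensus along the sequence; return (score, offset, window) of the best hit."""
    k = len(consensus)
    best = (-1, -1, "")
    for offset in range(len(sequence) - k + 1):
        window = sequence[offset:offset + k]
        score = count_matches(window, consensus)
        if score > best[0]:  # keep the first window reaching the highest score
            best = (score, offset, window)
    return best


RHO_CONSENSUS = "CAATCAA"  # E. coli rho-dependent consensus, as quoted in the text
BOP_BLOCK = "CAACGGAC"     # halobium bop sequence block, as quoted in the text

print(best_window(BOP_BLOCK, RHO_CONSENSUS))
```

For the block as quoted, the best seven-nucleotide window is CAACGGA at offset 0, with 4 matching positions.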
General Architecture of Ion-motive Bacterial Rhodopsins-An important question in comparing the halobium bop and hop, and the pharaonis hop sequences is their evolutionary distance: the identities and conservative replacements of residues in Fig. 4 could originate either from insufficient time to produce divergence (at the rate these genes evolve) or from the fixing of mutations which had survived selection pressures. Analysis of the DNA sequences has shown that these genes have had a chance to undergo very significant alterations since their branching (1-1.5 mutational events/nucleotide on the average, cf. below). A simple calculation indicates that for 1 event/nucleotide the probability for conserving a codon by chance would be less than 4% without wobble (for Trp and Met), and even for codons with full wobble in one position (e.g. for Thr, Ser, Ala, Leu, Pro) it would be only about 10%. Thus, the conserved residues in these sequences would appear to represent the outcome of significant selection. The identities, and to a lesser extent the conservative replacements, in the three amino acid sequences should therefore provide information on those residues essential for maintaining tertiary structure and the chromophore in the three pigments, as well as features responsible for processes not readily apparent in the finished proteins, such as insertion into the membrane, directing folding pathways, etc. The differences between halobium bop and the two hop sequences should reveal, on the other hand, those residues more important for the specific ion transport mechanisms. Fig. 5 shows conserved and conservatively replaced residues within putative helical regions in the three sequences. Arguments for the correspondence of the secondary structures of bacteriorhodopsin and halorhodopsin were given in Blanck and Oesterhelt (1987) and Lanyi et al. (1988).
Additionally, reconstitution of halorhodopsin into membrane-type structures and optical diffraction data of their electron micrographs⁵ gave dimensions similar to those of bacteriorhodopsin. The following points can be made from Fig. 5: (a) Helices C, F, and G contain the most conserved residues, consistent with the suggestion (Blanck and Oesterhelt, 1987) that it is mainly these helices which form the retinal-binding cavity. (b) Besides the retinal-bearing lysine residue on helix G, only 2 acidic and 3 basic residues are conserved within the putative helices. These consist of an aspartate (DDD) on helix G which may serve as the counter-ion to the Schiff base, DDD on helix D and RRK on helix F, which may influence the chromophore absorption maximum by interacting with its ionone ring (Spudich et al., 1986; Lanyi et al., 1988), and 2 arginine residues (RRR and RRR) near the extracellular surface on helices C and E. Conservation of the aspartate at the middle of helix D, but not of the two aspartates in bacteriorhodopsin on helix C, is consistent with the suggested involvement of the former in determining the spectrum of the chromophore but not in proton transport in bacteriorhodopsin, and the role of the latter two in the transport but not in the chromophore (Marinetti et al., 1989; Butt et al., 1989; Soppa et al., 1989). (c) The tyrosine residue on helix F (YYY) is conserved. This residue in bacteriorhodopsin was recently suggested to transiently gain a proton during the photocycle, since changing it into a phenylalanine eliminated changes in UV absorbance (Ahl et al., 1988) and FTIR bands (Braiman et al., 1988) attributed to a change in the protonation state of a tyrosinate.
Another conserved tyrosine (YYY on helix B) corresponds to Tyr-57 in bacteriorhodopsin.
Its replacement with phenylalanine (Mogi et al., 1987) or asparagine (Soppa et al., 1989) changed the light/dark adaptation properties of bacteriorhodopsin, and in the latter case abolished transport activity. (d) Three proline residues, near the center of helices B and C, and close to the extracellular end of helix F, are conserved. Alteration of the proline on helix F in bacteriorhodopsin has important effects on the chromophore and the photocycle (Ahl et al., 1988), but it can be replaced with small volume residues, such as glycine. (e) Four tryptophans, two on helix C and two on F, are conserved in all three proteins and might participate in complexation of the retinal moiety. Close contact of these residues and retinal was suggested from exciton coupling between these species (Polland et al., 1986). Mutations of 2 unconserved tryptophans, Trp-138 to arginine (Soppa et al., 1989) and Trp-137 to phenylalanine (Khorana, 1988), caused blue-shifts in the absorption spectrum of bacteriorhodopsin, but it is not clear whether these were caused by direct interaction with the retinal or by other structural effects. (f) There is a large number of conserved hydrophobic residues throughout the putative intramembrane region. Many of these are specifically retained as either aromatic or aliphatic residues, and only at a few locations, mainly on helix G, are aromatic residues replaced by aliphatics or vice versa (not shown in Fig. 5, but cf. Fig. 4).
Fig. 5. Only identical and conservatively replaced residues are shown, and only those assumed to be within the transmembrane helical regions. At each position 3 residues are given, in the following order: bacteriorhodopsin, halobium halorhodopsin, pharaonis halorhodopsin.
helix E (corresponding to Arg-134 in bacteriorhodopsin), have been suggested to play a role in transport, i.e. to constitute sites for chloride binding in halorhodopsin (Lanyi et al., 1988). However, the interfacial Arg residue on helix C may have an effect on the anion specificity of the transport (Duschl et al., 1990). (b) The finding of similarity between the A-B interhelix loop segments in halobium and pharaonis hop (not shown in Fig. 7, but see Fig. 4), containing several basic residues and prolines (within 9 residues, 4 arginines and 2 prolines in halobium hop, and 2 arginines, a lysine, and a proline in pharaonis hop), is probably also meaningful for chloride translocation, consistent with a proposed role of this segment in chloride release (Lanyi et al., 1988). However, the net charge of the A-B interhelical segment is +4 in halorhodopsin, but only +1 in pharaonis halorhodopsin, and it is possible that this is the location which influences the anion specificity in these systems (Duschl et al., 1990). (c) As discussed above, two aspartate residues on helix C, thought to play critical roles in proton transport by bacteriorhodopsin, are missing in the halorhodopsins.
The conserved acidic residues designated DE and ED, at the end of helix E and the beginning of helix F, not contained in bacteriorhodopsin, are highly suggestive of the importance of a carboxyl group at these locations, but their function in the halorhodopsins is unknown. Possibly, they form ion pairs with buried arginine residues and thereby stabilize them. Other aspartate residues, on helix F in halobium hop, and at the beginning of helix E in pharaonis hop, seem not to be essential. (c) Both cysteine residues are conserved in the halorhodopsins.
Binding of Hg2+ by one or both of these residues affected the chromophore in halobium halorhodopsin, suggesting that at least 1 Cys residue is located in the vicinity of the retinal (Ariki and Lanyi, 1984).
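The chance-conservation estimate given earlier in this section (less than 4% for a codon without wobble, about 10% with full wobble in one position, at 1 event/nucleotide) can be illustrated with a minimal sketch. The text does not state how the per-nucleotide retention probability was obtained, so two assumed values are compared: a hypothetical 1/3, and the Poisson probability exp(-1) of a site receiving no event at a mean of one event per site. Sites are treated as independent, which is also an assumption.

```python
import math


def codon_retention(p_site, free_positions=0):
    """Chance that a codon still specifies the same residue.

    p_site: assumed probability that one constrained nucleotide is unchanged.
    free_positions: number of codon positions that may change freely (wobble).
    Sites are assumed to be independent.
    """
    if not 0 <= free_positions <= 3:
        raise ValueError("a codon has three positions")
    constrained = 3 - free_positions
    return p_site ** constrained


P_ONE_THIRD = 1.0 / 3.0         # hypothetical value, chosen for illustration
P_NO_EVENT = math.exp(-1.0)     # Poisson chance of zero events at 1 event/site

for label, p in (("1/3", P_ONE_THIRD), ("exp(-1)", P_NO_EVENT)):
    no_wobble = codon_retention(p)           # e.g. Trp, Met
    one_wobble = codon_retention(p, 1)       # e.g. Thr, Ser, Ala, Leu, Pro
    print(f"p_site = {label}: no wobble {no_wobble:.3f}, one free position {one_wobble:.3f}")
```

With a retention probability of 1/3 the sketch gives about 3.7% and 11%, close to the quoted figures; the Poisson value gives about 5% and 13.5%. Either way, chance alone retains only a small minority of codons at this level of divergence.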

Evolutionary Relationships among the Ion-Motive Bacterial Rhodopsins-The evolution of bacterial rhodopsins is of great interest in the context of evolution within the halobacterial family, and the relationship of these organisms to eukaryotes. We have compared the three DNA and protein sequences to estimate the number of nucleotide and amino acid residue identities between each pair of genes. Segments containing gaps in any of the sequences were removed. The results are given in Fig. 8A. The percent identities which would arise by chance, given the nucleotide and amino acid compositions, are given in parentheses.
Fig. 8. A, percent identities at the nucleotide (normal characters) and the amino acid (bold characters) level. Frequencies of identities expected for random (but aligned) pairs of sequences of the compositions of the three genes are given in parentheses. B, replacement (normal characters) and silent (bold characters) divergences, calculated with a program according to Brown et al. (1982). A 70% frequency for transitions was assumed.
Although the calculated numbers are somewhat biased by having first selected an alignment (Fig. 4), it is clear that at the DNA level the numbers of identities are not very different from what chance would predict (with the possible exception of the halobium hop-pharaonis hop comparison).
At the protein level, however, the extents of identities are clearly meaningful.
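A minimal sketch of the two quantities compared here, the observed percent identity of an aligned pair and the identity expected by chance from composition alone, is given below. It assumes that columns with a gap in either sequence are dropped and that the chance expectation is the sum, over symbols, of the product of their frequencies in the two sequences; the exact procedure behind Fig. 8A is not described in the text, and the toy sequences are invented.

```python
from collections import Counter


def ungapped_columns(a, b):
    """Aligned column pairs with no gap in either sequence."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]


def percent_identity(a, b):
    """Observed percent identity over ungapped columns."""
    cols = ungapped_columns(a, b)
    matches = sum(x == y for x, y in cols)
    return 100.0 * matches / len(cols)


def expected_identity(a, b):
    """Chance identity (as a fraction) for random pairing at these compositions."""
    cols = ungapped_columns(a, b)
    n = len(cols)
    freq_a = Counter(x for x, _ in cols)
    freq_b = Counter(y for _, y in cols)
    return sum((freq_a[s] / n) * (freq_b[s] / n) for s in freq_a)


# Toy aligned pair (invented for illustration).
seq1 = "ATGGCC-A"
seq2 = "ATGACCTA"
print(f"observed: {percent_identity(seq1, seq2):.1f}%")
print(f"expected by chance: {100 * expected_identity(seq1, seq2):.1f}%")
```

The same functions work for protein sequences; with twenty symbols instead of four the chance expectation is much lower, which is why identities at the protein level stand out more clearly against chance than identities at the DNA level.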
The comparison indicates that halobium and pharaonis hop are much closer to one another than to halobium bop. However, this conclusion is biased by the fact that halorhodopsin and pharaonis halorhodopsin have the same function, and they are expected on this basis alone to be more similar to one another than to bacteriorhodopsin.
A more sophisticated comparison, which is less dependent on selection for function, can be made from DNA sequences, as described in the following.
As evolution proceeds, DNA sequences change more rapidly than protein sequences because the nucleotide code is degenerate. Furthermore, although selection at the nucleotide level (codon usage, etc.) exists also, selection at the protein level seems to be much greater. It has been demonstrated (Brown et al., 1982), in fact, that during the evolutionary divergence of protein-coding genes the large majority of nucleotide changes are of the silent type, i.e. those which do not result in amino acid residue changes. Numerical estimates of mutations at codon positions which do not change the amino acid residue, and at those which do, are provided by the silent and replacement divergences, respectively. These parameters, corrected for the frequency of multiple events, can be calculated from aligned nucleotide sequences and correspond to the average number of mutational events per nucleotide residue (Perler et al., 1980; Brown et al., 1982). Fig. 8B contains a table of comparisons for halobium bop and hop, and pharaonis hop (replacement divergence shown in normal characters, silent divergence in bold characters). All of the silent divergences calculated are remarkably high, amounting to 1-1.5 changes/nucleotide, indicating that these DNA sequences have had a chance to change considerably from one another. This conclusion is supported by the calculated nucleotide identities in Fig. 8A, and by dot-matrix comparisons of halobium and pharaonis hop (not shown): although at the protein level the genes are clearly related to one another with this analysis also, at the DNA level no relationship can be discerned, regardless of how the comparison parameters are manipulated (in spite of the fact that enough homology exists to have made halobium hop a usable probe in the cloning of pharaonis hop). In the halobium hop-pharaonis hop comparison in Fig. 8B the replacement divergence is about 2.5 times smaller than the silent divergence, as expected from selection at the protein level against replacement. In contrast, in the halobium bop-halobium hop and halobium bop-pharaonis hop comparisons the replacement divergences are nearly as high as the silent divergences, supporting our suspicion that many of the amino acid replacements are the results of selection pressures which fix different residues due to functional differences in the proteins (cf. above). Thus, we feel that meaningful evolutionary distances will be provided only by the silent divergences. This kind of analysis gives, however, a different result from the percent identities: according to Fig. 8B, it is the pharaonis hop and halobium bop sequences which are more closely related to one another.
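The distinction between silent and replacement changes can be made concrete with a simple codon-by-codon tally. The sketch below only counts aligned codon pairs that are identical, differ silently, or differ with an amino acid replacement, using the standard genetic code. It is not the corrected per-nucleotide divergence calculation of Perler et al. (1980) used for Fig. 8B, and the toy sequences are invented.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_pairs(seq1, seq2):
    """Tally aligned codon pairs as identical, silent, or replacement."""
    if len(seq1) != len(seq2) or len(seq1) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and of equal length")
    counts = {"identical": 0, "silent": 0, "replacement": 0}
    for pos in range(0, len(seq1), 3):
        c1, c2 = seq1[pos:pos + 3], seq2[pos:pos + 3]
        if "-" in c1 or "-" in c2:
            continue  # codons touching a gap are skipped
        if c1 == c2:
            counts["identical"] += 1
        elif CODON_TABLE[c1] == CODON_TABLE[c2]:
            counts["silent"] += 1
        else:
            counts["replacement"] += 1
    return counts


# Toy in-frame pair: Met-Thr-Leu-Asp versus Met-Thr-Leu-Glu.
print(classify_codon_pairs("ATGACCCTGGAT", "ATGACGCTCGAA"))
```

A raw tally of this kind undercounts events when sequences are as diverged as those compared here, since a site may have changed more than once; that is the reason the published divergences are corrected for multiple events.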
In reconstructing evolutionary pathways, it is usual to compare genes coding for proteins of the same function. Thus, a valid evolutionary tree contains distances not dependent on the pressures which had selected the specific functions of the proteins. In the case of the bacterial rhodopsins, the distances should be estimated from the silent divergences.
Unfortunately, however, these are so high (Fig. 8B), that uncertainties in the correction for multiple events at the same nucleotide are likely to be on the same order as the differences in the divergences. It would seem that, in spite of the close analogies between these proteins, rapid evolution at the DNA level had nearly completely obliterated the features of the tree which connects them.
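Why the correction for multiple events becomes unreliable at such high divergences can be illustrated with the Jukes-Cantor formula, used here only as a simple stand-in. It is not the method behind Fig. 8B, which followed Perler et al. (1980) and Brown et al. (1982) with an assumed 70% transition frequency. Under the Jukes-Cantor assumption of equal rates for all substitutions, an observed fraction p of differing sites corresponds to d = -(3/4) ln(1 - 4p/3) events per site, and the derivative of d with respect to p shows how strongly an error in p is magnified.

```python
import math


def jukes_cantor_distance(p_observed):
    """Corrected events per site from the observed fraction of differing sites."""
    if not 0.0 <= p_observed < 0.75:
        raise ValueError("observed difference must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p_observed / 3.0)


def error_amplification(p_observed):
    """Derivative of the corrected distance with respect to the observed fraction."""
    if not 0.0 <= p_observed < 0.75:
        raise ValueError("observed difference must lie in [0, 0.75)")
    return 1.0 / (1.0 - 4.0 * p_observed / 3.0)


for p in (0.10, 0.55, 0.60, 0.65):
    d = jukes_cantor_distance(p)
    amp = error_amplification(p)
    print(f"observed {p:.2f} -> corrected {d:.2f} events/site, amplification {amp:.2f}")
```

Under this assumed model, observed differences of roughly 55-65% correspond to about 1-1.5 events per site, and in that range the correction magnifies any error in the observed fraction several-fold, whereas at 10% observed difference the magnification is close to 1.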