Molecular Cloning and Characterization of a New Member of the Gap Junction Gene Family, Connexin-31*

A new member of the connexin gene family has been identified and designated rat connexin-31 (Cx31) based on its predicted molecular mass of 30,960 daltons. Cx31 is 270 amino acids long and is coded for by a single copy gene. It is expressed as a 1.7-kilobase mRNA that is detected in placenta, Harderian gland, skin, and eye. Cx31 is highly conserved and can be detected in species as distantly related to rat as Xenopus laevis. It exhibits extensive sequence similarity to the previously identified connexins, 58, 60, and 40% amino acid identity to Cx26, Cx32, and Cx43, respectively. When conservation of predicted phosphorylation sites is used to adjust the alignment of Cx31 to other connexins, a unique alignment of three predicted protein kinase C phosphorylation sites near the carboxyl terminus of Cx31 with three sites at the carboxyl terminus of Cx43 is revealed.
The gap junction is a structure composed of two closely apposed plasma membranes with a tightly packed array of cell-to-cell channels (Revel and Karnovsky, 1967). The physiology of the channels has been characterized in some detail in several experimental systems (Loewenstein, 1981; Spray and Bennett, 1985). In vertebrates, they have been shown to provide a low resistance electrical pathway between cells and to allow the passage of molecules <1000 Da with little or no selectivity (Flagg-Newton et al., 1979). These channels are thought to have many important biological functions including the regulation of growth control (Mehta et al., 1986), synchronization of cellular activity including synchronized contraction of myocardial cells (Barr et al., 1965), regulation of embryonic development and differentiation (Pitts, 1978), and metabolic homeostasis (Sheridan et al., 1979).
Gap junctions were first identified in the goldfish Mauthner cell by Robertson (Robertson, 1963). They have now been identified in almost every metazoan that has been examined, and they have also been described morphologically in a wide variety of tissues. This wide distribution and conservation of structure suggest that the gap junction is involved in a fundamental biological function shared by all multicellular animals.
Despite their wide distribution, gap junctions have been isolated only from a few organs in a few organisms. Hepatic gap junctions from mouse and rat were isolated, and two proteins of Mr 21,000 and Mr 28,000 were identified as their components. Cardiac gap junctions have been shown to have a principal protein of Mr 47,000 (Kensler and Goodenough, 1980; Manjunath et al., 1982). Gap junctions have also been isolated from rat and bovine lens, and a protein of Mr 70,000 has been identified as the major component in the lens fiber cell (Kistler et al., 1988). This protein has the same amino-terminal sequence as a protein with a predicted molecular mass of 46,000 Da (Beyer et al., 1988). The relationship between the two proteins is not yet understood. Finally, gap junctions from the arthropods Nephrops and Drosophila have been isolated, and several putative protein components have been identified (Berdan and Gilula, 1988; Buultjens et al., 1988; Ryerse, 1989).
The major structural proteins of gap junctions identified by isolation are now called connexins. They are members of a gene family that was first identified on the basis of protein sequence (Nicholson et al., 1985; Nicholson et al., 1981). Several cDNAs coding for connexins now have been isolated. In rat, the three proteins for which cDNAs have been described are designated Cx26, Cx32, and Cx43 based on their predicted molecular mass (Beyer et al., 1987; Paul, 1986; Zhang and Nicholson, 1989). These cDNAs correspond to the hepatic Mr 21,000 and Mr 28,000 proteins and the cardiac Mr 47,000 protein, respectively. Homologues to these connexins from several other species have also been isolated (Gimlich et al., 1990; Lash et al., 1990; Musil et al., 1990). Another member of the connexin gene family expressed in early Xenopus development, Cx38, has also been identified (Ebihara et al., 1989). Through the use of the cDNAs as probes, the distribution of the various connexins has been described in several tissues and cell lines (Beyer et al., 1987; Crow et al., 1990; Larson et al., 1990; Musil et al., 1990; Zhang and Nicholson, 1989). All connexins identified to date have unique distributions. Some tissues and organs have a single connexin while others have more than one, but no two are always found together. In rodent hepatocytes it appears that a single cell coexpresses both Cx26 and Cx32 (Traub et al., 1989).
Currently there is little known about the genes that code for the connexins. The gene coding for one connexin, rat Cx32, has been isolated and described (Miller et al., 1988). It is a single copy gene with two exons, of which the second contains the entire coding region for the protein. The genes coding for Cx26 and Cx43 are also single copy (Musil et al., 1990; Zhang and Nicholson, 1989). The gene coding for Cx26 appears to have at least two exons, one of which contains the entire coding region as determined by Southern blotting (Zhang and Nicholson, 1989). All connexins cloned to date were isolated first from cDNA libraries. The current study uses instead low stringency screening of a rat genomic library to isolate genes that code for connexin homologues to understand further the diversity, distribution, and phylogeny of this family of proteins. Using this alternate strategy we have identified a new member of the connexin gene family designated Cx31. Characterization of the gene reveals a 270-amino acid open reading frame with a high degree of sequence similarity to other connexins. This connexin shares many features with previously identified members of the family, including an alignment and conservation of three potential phosphorylation sites at the extreme carboxyl terminus of Cx43. We have also produced the first phylogenetic tree for the known connexins, which shows that Cx31 is closely related to Cx26 and Cx32 and distantly related to Cx43.

The abbreviations used are: all connexins are abbreviated Cx followed by their predicted molecular mass. When appropriate, the species from which the connexin was identified is designated by the prefixes C, H, R, or X for chicken, human, rat, and Xenopus, respectively. Divergence times are given in million years (MYr) or billion years (BYr). Every 1000 bases is a kilobase (kb). DNA melting temperature is designated Tm.

EXPERIMENTAL PROCEDURES
Materials-Restriction endonucleases and T7 RNA polymerase were obtained from Boehringer Mannheim. Modified T7 DNA polymerase and sequencing reagents were purchased from U. S. Biochemical Corp. Chemicals were from Sigma or Boehringer Mannheim. Sprague-Dawley rats were supplied by Simonsen Laboratories (Gilroy, CA).
Isolation of Connexin Homologue Gene-To identify connexin homologues we screened a Sprague-Dawley rat genomic EcoRI partial digest Charon 4A library using the Cx32 cDNA as a probe under low stringency conditions (Paul, 1986; Sargent et al., 1979). Briefly, the Cx32 cDNA in pGEM-3 (Promega Biotec, Madison, WI) was linearized in the polylinker on the 5' side of the insert with BamHI, and an antisense RNA probe with specific activity 2.5 μCi/ng was transcribed using T7 RNA polymerase and [α-32P]UTP (Melton et al., 1984; Tabor and Richardson, 1985). The RNA probe was used to screen approximately five genome equivalents (1 × 10^6 clones) of the genomic library plated at 5 × 10^4 plaques per 150-mm plate and replicated in duplicate onto Hybond-N (Amersham Corp.) membranes. Prehybridization and hybridization were carried out in 5× SSPE (1× SSPE = 150 mM NaCl, 10 mM Na2H(PO4), and 0.1 mM EDTA, pH 7.2), 5× Denhardt's solution (50× Denhardt's = 1% w/v bovine serum albumin, 1% w/v polyvinylpyrrolidone, and 1% w/v Ficoll), 30% deionized formamide, 20 μg/ml poly(A) RNA, 30 μg/ml yeast tRNA, and 0.5-2 ng/ml RNA probe at 45 °C for 6 and 36 h, respectively. Membranes were treated at increasing stringency to a final wash of 1× SSPE at 60 °C for 4 h and subjected to autoradiography. Positive clones were identified, and the procedure was repeated until clonal purity was achieved. The genomic clones isolated were characterized further by restriction site analysis and Southern blotting using the rat Cx26, Cx32, and Cx43 cDNAs as probes (Beyer et al., 1987; Paul, 1986; Zhang and Nicholson, 1989). A 4.4-kb EcoRI fragment from one λ clone, RGJ21, which cross-hybridized with all three probes, was subcloned into pBluescript II KS(+) (Stratagene Cloning Systems, La Jolla, CA), characterized in more detail, and both strands of a 1.1-kb EcoRI-SacII fragment were sequenced by standard dideoxy sequencing (Sanger et al., 1977; Tabor and Richardson, 1987).
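The screening scale quoted above (about five genome equivalents from 1 × 10^6 clones plated at 5 × 10^4 plaques per plate) can be checked with a short calculation. The sketch below is illustrative only: the mean insert size and the rat genome size are assumptions that the text does not state, and the probability estimate uses the standard Clarke-Carbon relation, which presumes random representation of the genome in the library.

```python
import math

def genome_equivalents(n_clones, mean_insert_bp, genome_bp):
    """Library coverage expressed as multiples of the haploid genome."""
    return n_clones * mean_insert_bp / genome_bp

def plates_needed(n_clones, plaques_per_plate):
    """Number of plates required to plate n_clones at the given density."""
    return math.ceil(n_clones / plaques_per_plate)

def probability_sequence_present(n_clones, mean_insert_bp, genome_bp):
    """Clarke-Carbon estimate P = 1 - (1 - f)^N, with f = insert / genome."""
    f = mean_insert_bp / genome_bp
    return 1.0 - (1.0 - f) ** n_clones

# Values taken from the text
N_CLONES = 1_000_000
PLAQUES_PER_PLATE = 50_000

# Assumptions for illustration (not stated in the text)
MEAN_INSERT_BP = 15_000   # hypothetical mean insert size for a lambda vector
RAT_GENOME_BP = 3.0e9     # approximate haploid genome size, assumed

coverage = genome_equivalents(N_CLONES, MEAN_INSERT_BP, RAT_GENOME_BP)
plates = plates_needed(N_CLONES, PLAQUES_PER_PLATE)
p_present = probability_sequence_present(N_CLONES, MEAN_INSERT_BP, RAT_GENOME_BP)

print(f"coverage: {coverage:.1f} genome equivalents on {plates} plates")
print(f"chance a given single copy sequence is present: {p_present:.3f}")
```

Under these assumed sizes the arithmetic agrees with the stated five genome equivalents. The probability figure is an upper bound in practice, because an amplified library made from a partial digest is not a random sample of the genome.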
Genomic Southern Blots-Southern blots of genomic DNA from rat (Sprague-Dawley), mouse (Balb/c), pig, and frog (Xenopus laevis) were performed under high and moderate stringency conditions. The rat DNA was isolated as described by Strauss (Strauss, 1988), the mouse and pig DNA were obtained commercially (Clontech, Palo Alto, CA), and the Xenopus DNA was provided by R. Wagner (California Institute of Technology). The rat DNA was digested separately with BglI, EcoRI, HindIII, KpnI, NheI, SacII, or XbaI, separated on a 0.8% agarose gel, and capillary transferred onto Hybond-N as described by Maniatis et al. (1982). The blot was probed with the Cx31 EcoRI-SacII fragment random primer labeled to a specific activity of 1 × 10^9 cpm/μg with [α-32P]dATP (Feinberg and Vogelstein, 1984). High stringency prehybridization and hybridization were carried out in 5× SSPE, 5× Denhardt's solution, 50% deionized formamide, 1 mM sodium pyrophosphate, 1 mM ATP, 0.1% sodium dodecyl sulfate, 20 μg/ml salmon sperm DNA, 30 μg/ml yeast tRNA, and 2-5 ng/ml probe DNA at 45 °C for 24 and 36 h, respectively. Blots were washed at increasing stringency to a final 0.1× SSPE at 68 °C for 4 h. For the moderate stringency "zoo" blot all DNAs were digested with EcoRI or HindIII and treated as above, except that the hybridization conditions were 5× SSPE, 5× Denhardt's, 1 mM sodium pyrophosphate, 1 mM ATP, 0.1% sodium dodecyl sulfate, 20 μg/ml salmon sperm DNA, 30 μg/ml yeast tRNA, and 2-5 ng/ml probe DNA at 60 °C. The zoo blots were washed at increasing stringency to a final 1× SSPE at 65 °C for 2-10 h. Tm values were estimated by the method of Meinkoth (Meinkoth and Wahl, 1984). All fragments were sized using the GEL regression program.
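To make the relation between wash conditions and detectable sequence identity concrete, the following sketch applies the widely used empirical formula for DNA duplex stability, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 0.61(%formamide) - 500/L, together with the rule of thumb that Tm falls by about 1 °C per 1% mismatch. The sodium concentration of 1× SSPE, the probe G+C content, and the probe length used here are assumptions for illustration; none of them is reported in the text, and this is not a reproduction of the authors' calculation.

```python
import math

def tm_dna_duplex(na_molar, gc_percent, formamide_percent, length_bp):
    """Estimated melting temperature (deg C) of a perfectly matched DNA duplex."""
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)

def min_identity_detected(tm_perfect, wash_temp_c, deg_per_percent=1.0):
    """Lowest percent identity expected to survive a wash at wash_temp_c."""
    tolerated_mismatch = max(0.0, (tm_perfect - wash_temp_c) / deg_per_percent)
    return 100.0 - tolerated_mismatch

# Assumed inputs (not reported in the text)
NA_1X_SSPE = 0.16   # approximate molar Na+ in 1x SSPE
PROBE_GC = 55.0     # hypothetical percent G+C of the probe
PROBE_LEN = 1000    # hypothetical probe length in base pairs

# Moderate stringency "zoo" blot: final wash in 1x SSPE at 65 deg C
zoo_tm = tm_dna_duplex(NA_1X_SSPE, PROBE_GC, 0.0, PROBE_LEN)
zoo_identity = min_identity_detected(zoo_tm, 65.0)

# High stringency blot: final wash in 0.1x SSPE at 68 deg C
high_tm = tm_dna_duplex(0.1 * NA_1X_SSPE, PROBE_GC, 0.0, PROBE_LEN)
high_identity = min_identity_detected(high_tm, 68.0)

print(f"zoo blot: Tm {zoo_tm:.1f} C, identity detected >= {zoo_identity:.0f}%")
print(f"high stringency: Tm {high_tm:.1f} C, identity detected >= {high_identity:.0f}%")
```

With these assumed inputs the moderate stringency wash tolerates roughly 25% mismatch, while the high stringency wash tolerates only about 6%. A different G+C content would shift both estimates.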
Northern Blots of Organ RNA-To determine the expression pattern of RCx31 we isolated total RNA from several tissues and organs by the guanidinium thiocyanate method or the modified guanidinium thiocyanate method that includes a centrifugation through CsCl (MacDonald et al., 1987). The sources of RNA were liver, heart, tail skin including the dermis, tail connective tissue (everything left after the skin was removed), Harderian gland, eye, placenta from a 19-day pregnant animal, epididymal fat pads, brain without the cerebellum, blood, stomach, femur including marrow, pancreas, spleen, ovary, uterus, thigh skeletal muscle, lung, duodenum, testis, and kidney, all from both male and female Sprague-Dawley rats. Northern blots were performed by electrophoresing 10-20 μg of each RNA in a formaldehyde gel. The separated RNA was capillary transferred onto Hybond-N and probed under conditions identical with the high stringency Southern blots except that the temperature was increased to 48-50 °C.
Analysis of RCx31 Sequence-All computer analyses were carried out with either PC-Gene version 6.01 or 6.25 (PCG) running on an Epson Equity II or the University of Wisconsin Genetics Computer Group package version 6.2 (GCG) (Devereux et al., 1984) running on a VAX Station 3100 (Model m38) unless otherwise specified. All parameters are default unless otherwise indicated. The hydropathy analysis was carried out on RCx31 using a 15-amino acid Kyte and Doolittle window in SOAP (PCG) (Kyte and Doolittle, 1982). The RCx31 sequence was scanned for known consensus sites for post-translational protein modification with PROSITE (PCG). Version 3.0 of PROSITE detects possible glycosylation, phosphorylation, sulfonation, amidation, fatty acylation, hydroxylation, carboxylation, phosphopantetheine attachment, and farnesyl group binding sites.
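The hydropathy analysis described above amounts to averaging Kyte and Doolittle hydropathy values over a sliding window. A minimal sketch of that calculation with a 15-residue window is given below. It is not the SOAP program itself; the function names, the threshold argument, and the test peptide are hypothetical and chosen only for illustration.

```python
# Kyte and Doolittle (1982) hydropathy index for the 20 standard amino acids
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=15):
    """Mean hydropathy of every full window along seq (one value per window)."""
    scores = [KD[aa] for aa in seq.upper()]
    if len(scores) < window:
        return []
    total = sum(scores[:window])
    profile = [total / window]
    for i in range(window, len(scores)):
        # Slide the window one residue to the right
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile

def hydrophobic_windows(profile, threshold):
    """1-based start positions of windows whose mean exceeds threshold.

    The threshold is left to the caller; no cutoff is given in the text.
    """
    return [i + 1 for i, value in enumerate(profile) if value > threshold]

# Hypothetical test peptide (not a connexin sequence)
peptide = "MDEKKRSNQ" + "LLVIALLVVAILLGA" + "DEKRSNQDE"
profile = hydropathy_profile(peptide, window=15)
print(hydrophobic_windows(profile, threshold=1.5))
```

Plotting the profile against window position would give the familiar hydropathy plot, in which candidate membrane-spanning segments appear as broad peaks.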
Multiple Sequence Alignment-Multiple alignments of the new rat Cx31 with protein sequences for rat Cx43, Cx32, and Cx26; chicken Cx43; Xenopus Cx43, Cx30, and Cx38; and human Cx32 were generated three sequences at a time using the program ALP3. The ALP3 algorithm has been described by Murata et al. (1985). The triple alignments were compiled and adjusted by eye, taking into account the predicted phosphorylation sites, using the program LINEUP (GCG).
Construction of Phylogenetic Tree-To produce a phylogenetic tree and estimate divergence times, the connexins were divided into two homology domains; the first domain was from amino acid 2 to 99 in Cx31, and the second domain corresponded to positions 125-153, 155-165, and 168-211 in Cx31 and the aligned domains in the other connexins (see Fig. 6). Phylogenetic trees were generated using the program CLUSTAL (PCG) (Higgins and Sharp, 1988) for both domains I and II using rat Cx26, Cx31, Cx32, Cx43, and Xenopus Cx38. This program generates all possible pair-wise alignments using the algorithm of Wilbur and Lipman (1983). It then generates dendrograms using the unweighted pair group maximum averages method of Sneath and Sokal (1973). Divergence rates for Cx32 and Cx43 were estimated by linear regression analysis of plots of the percent corrected divergence for RCx32, HCx32, and XCx30; and RCx43, CCx43, and XCx43 versus divergence times using the method of Perler (Perler et al., 1980). The times used are human/rodent 75 MYr, mammal/bird 275 MYr, and mammal/amphibian 350 MYr (Dayhoff, 1972; Doolittle et al., 1989; Perler et al., 1980).
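The dendrogram step can be illustrated with a generic unweighted pair group (average linkage) clustering of a pairwise distance matrix. The sketch below is not the CLUSTAL implementation, and the taxa A to D with their distances are invented toy values, not connexin data. It only shows how the two closest clusters are merged repeatedly and how each branch point is placed at half the distance between the merged clusters.

```python
from itertools import combinations

def upgma(names, distances):
    """Average-linkage clustering.

    names: list of taxon labels.
    distances: dict mapping (a, b) label pairs to a pairwise distance.
    Returns a nested tuple (left, right, height) describing the tree.
    """
    d = {frozenset(pair): value for pair, value in distances.items()}
    clusters = {name: (name, 1) for name in names}  # id -> (subtree, size)
    while len(clusters) > 1:
        # Pick the closest pair of current clusters
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))] / 2.0
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        merged = f"({a},{b})"
        # Distance from the merged cluster to every other cluster is the
        # size-weighted mean of the distances from its two parts
        for c in clusters:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = ((tree_a, tree_b, height), size_a + size_b)
    (tree, _size), = clusters.values()
    return tree

# Toy distances for four hypothetical taxa (illustration only)
toy_names = ["A", "B", "C", "D"]
toy_distances = {
    ("A", "B"): 2.0, ("A", "C"): 6.0, ("A", "D"): 10.0,
    ("B", "C"): 6.0, ("B", "D"): 10.0, ("C", "D"): 10.0,
}
toy_tree = upgma(toy_names, toy_distances)
print(toy_tree)
```

In the toy example A and B join first, C joins them next, and D branches off last. A method of this kind assumes a roughly constant rate of change along all branches, which is also the assumption behind converting branch depths to divergence times.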

RESULTS
Isolation of a New Connexin Gene-Thirty genomic clones that cross-hybridized to the RCx32 cDNA probe were isolated. Southern blotting and restriction site analysis of these clones revealed that six corresponded to two different polymorphisms of the RCx32 gene. A single clone designated RGJ21 that cross-hybridized with RCx26, RCx32, and RCx43 was identified and restriction-mapped (Fig. 1). The remaining clones are currently under further investigation, but three others that have been examined closely reveal no connexin homologues. The 4.4-kb fragment containing the homologous sequence was subcloned, and 982 bases between the EcoRI and SacII sites were sequenced. Fig. 2 shows the nucleotide sequence and the translation of the nucleotide sequence that revealed a 270-amino acid open reading frame coding for a protein with a predicted molecular mass of 30,960 Da. The predicted protein exhibits a high degree of similarity to previously characterized connexins, 58, 50, and 40% amino acid identity, and 65, 58, and 51% nucleotide identity to RCx26, RCx32, and RCx43, respectively. Based on its homology to other connexins and predicted characteristics we designate this protein rat connexin-31 (RCx31).

FIG. 2. Nucleotide sequence and predicted protein sequence for Cx31. Predicted protein kinase C phosphorylation and casein kinase II sites as predicted by PROSITE are marked P and C, respectively.

GEL is a public domain program available from Intelligenetics, Mountain View, CA. ALP3 is a public domain program available from Intelligenetics, Mountain View, CA.
Genomic Southern Blots-High stringency Southern blots of rat genomic DNA digested with seven different enzymes probed with Cx31 all revealed only a single band, including the 1.6-kb KpnI and 4.4-kb EcoRI fragments predicted from the genomic clone (Fig. 3). To control for cross-hybridization the same blot was hybridized with the RCx32 cDNA. It produced no bands identical with the Cx31 blot. This would suggest that RCx31 is a single copy gene. Southern blots of rat, mouse, pig, and frog genomic DNA were probed with Cx31 under conditions that allow sequences of greater than 75% nucleotide identity to be detected based on the estimated Tm (Fig. 4). Single EcoRI bands of 4.4, 4.1, 1.0, and 3.2 kb are seen in the rat, mouse, pig, and frog DNAs, respectively. Single HindIII bands of 3.0, 18, and 4.1 kb are also seen in the rat, mouse, and frog DNAs, respectively. In addition, a strong 3.3-kb HindIII band and three larger but weaker bands are seen in the pig DNA. This suggests that Cx31 is highly conserved and present in these species.
Distribution-A Northern blot of total RNA isolated from 20 different rat organs was probed with Cx31. It shows the presence of a 1.7-kb mRNA in placenta, Harderian gland, skin, and eye (Fig. 5). Some other organs show weak hybridization, but these cannot be described with confidence as positive signals. Control blots probed with RCx26, RCx32, and RCx43 show hybridization to 2.5-, 1.7-, and 3.0-kb mRNAs, respectively, in organs known to produce these transcripts, but no cross-hybridization with the RCx31 mRNA was detected. In addition to the organs known to express Cx26 and Cx43, we find a strong signal for both Cx26 and Cx43 in skin, and we detect Cx43 in bone, whole blood, and epididymal fat pads (data not shown). The control blots for Cx26 and Cx43 did not have RNA from placenta on them.

FIG. 3. Southern blot of rat genomic DNA probed with Cx31. Only a single band is seen in all lanes suggesting that Cx31 is a single copy gene. Hybridization of the same blot with a rat Cx32 probe revealed no bands identical with Cx31.
Analysis of RCx31 Protein Sequence-Hydropathy analysis of the RCx31 protein sequence reveals four highly hydrophobic amino acid segments similar to other connexins. In Cx32 and Cx43 these segments have been shown to be transmembrane spanning (Yancey et al., 1989; Milks et al., 1988; Zimmer et al., 1987). Analysis of the protein for post-translational consensus modification sequences reveals potential phosphorylation and amidation sites. The protein kinase C consensus (S/T)-X-(R/K) (Kishimoto et al., 1985; Woodgett et al., 1986) is present at amino acid positions 182, 223, 229, 233, and 238 (Fig. 2). Position 182 is in the predicted second extracellular loop and would likely not be available to a cytoplasmic kinase. The other sites are all predicted to be exposed in the cytoplasm, in the carboxyl-terminal region. A single potential casein protein kinase II site is predicted at position 202, which is in the middle of the putative fourth transmembrane helix (Kuenzel and Mulligan, 1987). An amidation consensus site X-G-(R/K)-(R/K) is present at position 120 (Kreil, 1984).

FIG. 4. Zoo blot probed with Cx31. Genomic DNA from rat, mouse, pig, and frog was digested with EcoRI and HindIII and probed at moderate stringency with a random primed Cx31 EcoRI-SacII fragment. The figure is a composite from two different washes of the same blot. The rat and mouse lanes had excessive background after the first wash, and the pig and frog lanes were too weak to permit adequate reproduction after the second wash. All washes were done in 1× SSPE at 65 °C.
Multiple Alignment-Multiple alignment of RCx31 with eight other connexins was carried out (Fig. 6; positions in the multiple alignment are referred to with the prefix MA). It includes all connexin homologues identified for which the full protein sequence is available, except for the bovine Cx43 recently published (Lash et al., 1990). The alignment shows a perfect conservation of the three cysteines in the first extracellular loop, but RCx31 has a single amino acid inserted between the first two cysteines in the second extracellular loop. The cysteines have been shown to form at least one disulfide bond between the extracellular loops in Cx32 (B. Nicholson, personal communication) and Cx43 (S. A. John, unpublished observations). RCx31 has a 22-amino acid deletion relative to the cytoplasmic loop of the Cx43s, leaving it with the smallest cytoplasmic loop of all the connexins. As is the case for the Cx43s and XCx38, Cx31 has an arginine at MA162 in the third highly amphipathic putative transmembrane segment. The carboxyl-terminal regions of the different connexins are highly diverged. To improve the alignment of RCx31 over this region, we have taken into consideration the predicted posttranslational modification sites. We find three predicted protein kinase C phosphorylation sites conserved when RCx31 and RCx43 are compared. These sites are at positions MA367, MA371, and MA376 in the multiple alignment and are characterized by the sequence SSRAS. The third predicted phosphorylation site has a threonine substituted for a serine and is offset one amino acid from the corresponding Cx43 site that is at MA375. In addition the alignment shows two other sequence similarities among the other proteins that have not been described previously. All connexins except RCx31 and RCx26 share an 11-13-amino acid stretch with four to six identities beginning at position MA319 in the multiple alignment characterized by a QNXGS sequence. Also, the carboxyl termini of Cx32 and Cx43 show a weak alignment across the SSRAS segment.

FIG. 5. Northern blot of total RNA isolated from 20 different rat organs probed with Cx31. RNAs were separated on a formaldehyde agarose gel and probed at high stringency with a random primed Cx31 EcoRI-SacII fragment. The 1.7-kb mRNA band is detected in skin, Harderian gland, eye, and placenta. Control blots probed with Cx26, Cx32, and Cx43 did not cross-react with any Cx31 bands. The Cx32 mRNA comigrated with the Cx31 mRNA when the above blot was reprobed. However, Cx31 and Cx32 were not detected in any of the same tissues. Control blots did reveal the presence of Cx26 in skin, and of Cx43 in skin, bone, whole blood, and epididymal fat pads.
Phylogenetic Tree of Connexins-Using the program CLUSTAL, a tree of the five unique connexins identified to date, for which nucleotide sequences are available, was constructed (Fig. 7). The tree has two major branches, one with Cx38 and Cx43, and a second with Cx26, Cx31, and Cx32. Based on estimated divergence rates for Cx32 and Cx43, the two branches these molecules represent diverged 1.3-1.9 BYr ago.

DISCUSSION
All previously identified connexins were isolated as cDNAs from organ- or tissue-specific libraries. Because it has become clear that the connexins form a moderately large gene family and that the various connexins exhibit tissue specificity, we decided to use an alternate approach to isolate new connexin homologues by screening a rat genomic library at low stringency. This more general approach will identify new connexins irrespective of the tissue in which they are expressed. In this way we have identified so far one new connexin, rat Cx31, that we describe here. The gene for Cx32 was isolated six times from the five genome equivalents we screened, but only a single Cx31 gene was found. In addition, the low stringency screening did not identify genes for either Cx43 or Cx26. There are several possible explanations for these anomalies. The library we used was amplified, which is known to cause biases in representation. In addition, the library was generated from an EcoRI partial digest, which obviously will result in some parts of the genome not being represented because the EcoRI fragments they generate are too large for the vector chosen.
The isolation of Cx31 now extends the number of connexins identified in rat to five, including Cx26, Cx32, Cx43, and Cx46. The nucleotide sequences for all of these, except Cx46, have been described. The number of genes isolated that code for connexins is now two, rat Cx31 and Cx32. A significant issue with respect to Cx31 is whether the entire coding region for the protein is represented in the sequence from the single exon we describe. The answer to this cannot be known with certainty until a cDNA for Cx31 has been isolated and characterized. However, we believe that the coding region sequence is complete for the following reasons. The connexin genes coding for Cx26 and Cx32, for which there is sufficient structural detail, and that appear to be related closely to Cx31, both contain the entire coding region in a single exon. The mRNA size of 1.7 kb, although large enough to accommodate a larger coding region, is the same as that of the Cx32 mRNA, which codes for a protein of similar size. Finally, there are no predicted exon/intron borders in Cx31 between the sequence coding for the fourth transmembrane helix and the carboxyl terminus (data not shown).
As in the case of the genes coding for rat Cx26, Cx32, and Cx43 (Miller et al., 1988; Musil et al., 1990; Zhang and Nicholson, 1989), the gene coding for Cx31 is a single copy gene. The fact that it shares no bands with Cx32 on a genomic Southern blot suggests that the Cx31 gene is not located near the Cx32 gene. Despite this, Cx31 and Cx32 may be part of a gene cluster with other connexins. Further understanding of the Cx31 gene structure must await the isolation of a cDNA. By Southern blotting the Cx31 gene appears to be highly conserved, and a hybridization signal is seen in an organism as diverged from rat as the amphibian X. laevis. Of course it is possible that the signal seen in Xenopus with the Cx31 probe is a connexin isoform. However, all the connexins, Cx32 and Cx43, which have been isolated from multiple species are much more conserved between species than between isoforms. Further comparison of Cx32 and Cx43 sequences demonstrates that the amino-terminal region, corresponding to the first 99 amino acids, of these molecules is particularly well conserved and changes at a rate similar to the slowly evolving cytochrome c and glyceraldehyde-3-phosphate dehydrogenase (Dayhoff, 1972).
Cx31 has a unique distribution, being found in skin, eye, Harderian gland, and placenta. It is found in the skin with Cx43 and Cx26. The skin is a complex tissue with many cell types including fibroblasts that have been shown to express Cx43 (Crow et al., 1990). Intercellular communication in the epidermis has been studied in some detail (Kam et al., 1986), but the exact localization of the different connexin molecules, their relationship to each other, and physiological roles have yet to be examined. Cx31 is found with Cx43 and MP70 (Cx46?) in the eye, where Cx43 is localized to the fibroblasts in the cornea and the cells of the lens epithelium and MP70 is localized to the lens fiber cell (Kistler et al., 1988).
To date Cx31 is the only connexin found in the placenta. Gap junctions have been described in the mature rat placenta in all three layers of the labyrinth. The barrier between maternal and fetal blood supplies is at the boundary between layers II and III. Tracers applied through the maternal blood supply easily pass through layer I and accumulate at the border between II and III. These layers have been shown to contain gap junctions, and it has been proposed that these gap junctions are responsible for the exchange of fluids, gases, and small metabolites between the mother and fetus in rat (Metz et al., 1978).

J. H. Hoh, unpublished observations.

FIG. 6. Multiple alignment of rat connexin-26 (RCx26), -31 (RCx31), -32 (RCx32), and -43 (RCx43), Xenopus connexin-30 (XCx30), -38 (XCx38), and -43 (XCx43), chicken connexin-43 (CCx43), and human connexin-32 (HCx32). Consensus amino acids are shown in upper case letters and differences are shown in bold lower case letters. The criterion for a consensus is the identity of half, or four or more of the sequences at a given position. Alignments were carried out three at a time using the program ALP3, then compiled using LINEUP, and adjusted by eye. The alignment shows the characteristic highly similar sequence for the first 230 amino acids except for a deletion of varying lengths with respect to all Cx43s at positions MA126-MA149. Further it reveals a conserved segment in several but not all connexins at MA320-MA333. When the alignment was adjusted further to take into account predicted phosphorylation sites, three conserved putative protein kinase C sites, marked with *, were discovered in the RCx31 and RCx43 sequences. RCx32 also has a weak similarity across this region.

FIG. 7. Phylogenetic tree of connexins. The tree is based on the CLUSTAL analysis of domains I and II of the protein sequences for rat Cx26, Cx31, Cx32, and Cx43, and Xenopus Cx38. The estimated time for the divergence of the two major branches is 1.3-1.9 BYr ago.
The Harderian gland is a secretory organ located on the posterior aspect of the eye in animals with a third eyelid. The biological function of the gland is not well understood. In mammals it has been shown to consist of two major cell types, secretory cells and myoepithelial cells (Woodhouse and Rhodin, 1963). In the hamster, both these cell types are innervated, and it has been suggested that the secretory functioning of the gland is regulated neuronally (Bucana and Nadakavukaren, 1972). A neuronal signal that activates secretion might be transmitted electrically or biochemically between cells through gap junctions. Further study of Cx31, which is the only connexin yet identified in the Harderian gland, may shed light on this process.
Analysis of the Cx31 protein sequence reveals several potential protein modification sites. The single potential amidation site is probably not utilized, because amidation usually is associated with processed proteins such as neuropeptides (Kreil, 1984). There are five predicted protein kinase C sites. The first, at position 182, is in the predicted second extracellular domain and probably not available to a kinase. The same is true for the single casein protein kinase II site at position 202, which is in the middle of the predicted fourth transmembrane helix. The remaining four protein kinase C sites are in the carboxyl-terminal region. Whether these sites are utilized must await biochemical analysis of the protein. It is interesting to note that protein kinase C and cAMP-dependent protein kinase have been implicated in regulation of junctional communication in a number of systems (Murray and Gainer, 1989). Cx32 has been shown to be phosphorylated by cAMP-dependent protein kinase in vivo and in vitro (Saez et al., 1986; Takeda et al., 1987). In addition, Cx43 in fibroblasts is phosphorylated (Crow et al., 1990).
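A consensus-site scan of the kind discussed above can be sketched with regular expressions. The protein kinase C pattern (S/T)-X-(R/K) and the amidation pattern X-G-(R/K)-(R/K) follow the consensus sequences quoted in the text; the casein kinase II pattern (S/T)-X-X-(D/E) is an assumption taken from the commonly cited PROSITE definition, since the text does not spell it out. The peptide is a hypothetical example, not the Cx31 sequence, and a match indicates only a potential site.

```python
import re

# Lookahead groups let overlapping sites be reported
MOTIFS = {
    "protein kinase C": r"(?=([ST].[RK]))",   # (S/T)-X-(R/K), from the text
    "casein kinase II": r"(?=([ST]..[DE]))",  # (S/T)-X-X-(D/E), assumed
    "amidation": r"(?=(.G[RK][RK]))",         # X-G-(R/K)-(R/K), from the text
}

def scan_sites(seq):
    """Return (site type, 1-based position, matched residues), by position."""
    hits = []
    for name, pattern in MOTIFS.items():
        for match in re.finditer(pattern, seq):
            hits.append((name, match.start() + 1, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])

# Hypothetical peptide for illustration only
peptide = "MASSRASGKKTLDE"
for name, position, residues in scan_sites(peptide):
    print(f"{position:>3}  {name:<18} {residues}")
```

Because the patterns are short, they match frequently by chance. This is why the predicted sites are weighed against membrane topology in the discussion above before any of them is considered a candidate for modification.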
The multiple alignments (Fig. 6) show the general characteristics of the gene family that have been described in previous comparisons (Gimlich et al., 1990). The molecules have a 200-220 amino acid core which is highly conserved, except for the intracellular loop. The loop is of variable length but highly conserved between homologous connexins from different species. The high degree of sequence similarity through the first 200-220 amino acids is particularly interesting considering the high degree of morphological conservation of the gap junction observed. Two notable features in this region are the conserved cysteines and the putative amphipathic helix. The two sets of cysteines have the pattern CX5CX3C in the first extracellular loop and CX3CX5C in the second extracellular loop. The only exception is in Cx31 where the second set of cysteines has a single amino acid inserted which results in the pattern CX4CX5C. The putative amphipathic helix, which has been proposed to line the pore of the channel, begins at position MA154 and has the conserved motif TX3SX3K/RX3E. The carboxyl termini are highly diverged and vary in length from approximately 10 to 150 amino acids and may provide regulatory specificity to the various connexin isoforms.

Several aspects of the alignment we show here are different from ones published previously. We find a unique alignment in the carboxyl-terminal region characterized by the sequence QNXsS or QNX9S, starting at position MA320, conserved in all connexins except RCx31 and RCx26. The significance of this sequence is not known, and it does not match any known structural or functional motif as determined by PROSITE. Its conservation in many connexins would suggest that it may be involved in a shared structure or function. Further study of RCx31 and its physiology may reveal more about the sequence, since it is not present in that molecule. The approach of using predicted phosphorylation sites to adjust the alignment of the sequences revealed a segment characterized by the sequence SSRAS, starting at position MA371, that is shared between RCx31 and RCx43. There is also a weak alignment with RCx32, which contains a predicted protein kinase C site at position MA377. The SSRAS segment has several putative protein kinase C phosphorylation sites in RCx31 or RCx43, but no actual positions of protein kinase C phosphorylation in any of these molecules have been reported. RCx32 has been shown to be phosphorylated in vitro by protein kinase C (Takeda et al., 1989).
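The cysteine spacing patterns and the amphipathic helix motif described above can be expressed compactly in code. The sketch below derives a pattern such as CX5CX3C from the positions of the cysteines in a segment, and tests for the motif TX3SX3K/RX3E with a regular expression. The segments used are made-up strings chosen only to exercise the functions; they are not connexin loop sequences.

```python
import re

def cysteine_pattern(segment):
    """Describe the spacing of cysteines in a segment, e.g. 'CX5CX3C'."""
    positions = [i for i, aa in enumerate(segment) if aa == "C"]
    if not positions:
        return ""
    # Number of residues lying between each pair of consecutive cysteines
    gaps = [later - earlier - 1 for earlier, later in zip(positions, positions[1:])]
    return "C" + "".join(f"X{gap}C" for gap in gaps)

# T, 3 residues, S, 3 residues, K or R, 3 residues, E
HELIX_MOTIF = re.compile(r"T.{3}S.{3}[KR].{3}E")

def has_helix_motif(segment):
    """True if the segment contains the TX3SX3K/RX3E motif."""
    return HELIX_MOTIF.search(segment) is not None

# Made-up segments for illustration only
first_loop_like = "ACAAAAACAAACA"    # cysteines spaced 5 and 3 apart
variant_loop_like = "CAAAACAAAAAC"   # cysteines spaced 4 and 5 apart
helix_like = "GTAAASAAARAAAEG"

print(cysteine_pattern(first_loop_like))
print(cysteine_pattern(variant_loop_like))
print(has_helix_motif(helix_like))
```

Applied to the extracellular loops of an aligned set of sequences, a helper of this kind would flag an insertion between conserved cysteines as a change in the reported pattern.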
The phylogenetic tree of connexins generated from analysis of the divergence rates, and the CLUSTAL program, predicts a branch between Cx43/Cx38 and Cx26/Cx31/Cx32 at 1.3-1.9 BYr. That estimated divergence time assumes that the amino acid replacement method of Perler is an accurate clock (Perler et al., 1980). It is relevant to note that the prokaryote/eukaryote divergence is estimated at 1.8 BYr, the plant/animal at 1.0 BYr, and the vertebrate/invertebrate at 0.6 BYr (Doolittle et al., 1989). It is also interesting to note that the phylogenetic tree of connexins is consistent with the nomenclature proposed for the gap junction protein family by Gimlich et al. (1990). That nomenclature is based on unspecified sequence similarities between connexins and uses Greek letters to identify homologous proteins. Eventually it will be useful to have a naming system independent of calculated molecular weight, because a name based on the size of the molecule will become useless once the repertoire of connexin isoforms and connexins from different species becomes too large. We suggest that phylogenetic relatedness would be a suitable criterion for naming connexins, but such a system must await further definition and analysis of the gene family. Until that time the current system of using the prefix Cx followed by the molecular mass is a practical approach.