A New Member of the Glutamine-rich Protein Gene Family Is Characterized by the Absence of Internal Repeats and the Androgen Control of Its Expression in the Submandibular Gland of Rats*

A cDNA, corresponding to a rat submandibular mRNA which is accumulated at a 20-fold higher level in males than females, has been isolated. The predicted protein, SMR2, has a calculated molecular mass of 15.4 kDa and is rich in glutamine/glutamic acid, proline, and asparagine/aspartic acid, a characteristic of the so-called salivary glutamine-rich proteins (GRPs) of the submandibular gland of rats. Nucleotide sequence comparisons indeed revealed strong similarities between the sequences of the SMR2 mRNA and that of GRPs, except in the region encoding the carboxyl-terminal part of the proteins. In particular, the SMR2 mRNA contains the 5'-untranslated region and the signal peptide region shared by both groups of GRPs and proline-rich proteins (PRPs). A major difference is that, in SMR2, the peptidic motif which is repeated four or five times in GRPs, is only found once. The SMR2 gene is about 3.5 kilobases in length and contains 4 exons. The second intron, which does not exist in characterized GRP genes, splits the "transition" region which separates the repetitive sequences from the signal peptide. This structure is reminiscent of that found in most PRP genes, strengthening the hypothesis that GRP and PRP genes have the same ancestral origin.

Two different functions are frequently assigned to the submandibular gland (SMG) of rodents. One is an exocrine function, the production of the salivary fluids. The other is an endocrine function, which leads to the release of certain growth factors and hormones into the blood.
Exocrine secretions mainly involve the acinar cells of the SMG. Among the secreted proteins are families of tissue-specific proteins characterized by highly repetitive contiguous peptide sequences. According to their predominant amino acids, these proteins have been classified as proline-rich proteins (PRPs) or glutamine-rich proteins (GRPs). Their role in the salivary fluids has not been elucidated. On account of their high affinity for calcium phosphate (1,2), it has been proposed that they could be involved in the protective proteinaceous structure of tooth surfaces. In addition, a role in the detoxification of certain substances, such as tannins, has been postulated for PRPs (3,4).
In addition, the structure of some PRP and GRP genes (8-12) has been established.
The peptide sequence of GRPs and PRPs can similarly be divided into four regions: a signal peptide, a "transition" region (which separates the repetitive region from the signal peptide), a repetitive region, and a carboxyl-terminal region. The organization of GRP and PRP genes is very similar and, in particular, the sequence of the first exon (corresponding to 5'-untranslated region and signal peptide) is highly conserved among these genes (8,9,11,12). This suggests that GRP and PRP genes may derive from a common ancestor.
PRPs are encoded by a multigenic family mapped on chromosome 12 in man (13). In mouse, the PRP genes were first assigned to chromosome 8 (14) on the basis of results with mouse x hamster somatic cell hybrids, but new linkage data indicate that they are on chromosome 6 (15). Evolutionary models for this gene family include a series of internal duplications of a 42-bp unit (9). Diversity would have been generated by recruitment or deletion of three bases from the ancestral unit during the duplication events, leading to a final length which varies between 42 and 63 bp. Finally, gene conversion would have homogenized the divergence between the internal repeats. GRP genes differ from PRP genes, in particular, by the length (69 bp) and the sequence of the repeats. They are also part of a multigenic family; more than 10 GRP genes have been detected by Southern blot analysis in rats (8). The sequences of the two characterized GRP mRNAs are identical except for the number of repetitive motifs and the carboxyl-terminal part of the proteins, probably due to recent gene conversion events (8).
We are interested in the androgen regulation of genes expressed in the SMG of rodents. A number of growth factors, hormones, and other proteins with biologically defined properties are synthesized in large amounts in the SMG of rodents under androgen control (16,17). This is the case, for instance, for renin, epidermal growth factor, and nerve growth factor in the SMG of mice. The role of these peptides in the saliva is unclear. Curiously, the pattern of proteins expressed at a higher level in the SMG of males than females appears to be species-specific. In an attempt to characterize some of the peptides whose expression is regulated by androgens in the SMG of rats, we have compared the patterns of in vitro translation products directed by the SMG mRNAs prepared from males and females (17). We have shown that several polypeptides are translated in higher amounts from male than female SMG mRNAs. One of them, SMR1, was shown to have the structure of a hormonal precursor, which could potentially give rise to a thyrotropin releasing hormone-like peptide after processing (18).
Here we report the characterization of another SMG mRNA, accumulated under androgen control. This mRNA encodes a protein, SMR2, which is related to the GRPs but contains only once the repetitive unit present in GRPs. The structure of the corresponding gene has been studied and is very similar to that of GRP except for the presence of an additional intron in the transition region coding sequence, reminiscent of that found in most PRP genes.

EXPERIMENTAL PROCEDURES AND RESULTS

DISCUSSION
In this paper, we report the characterization of a gene which is expressed under androgen control in the SMG of rats. The product of this gene, SMR2, belongs to the family of salivary glutamine-rich proteins. The nucleotide sequence of the SMR2 mRNA is about 75% homologous to that of GRP mRNAs (except in the regions encoding the carboxyl-terminal part of the proteins). A similar intron-exon structure is found in GRP and SMR2 genes.
Like the GRPs, SMR2 contains a relatively high proportion of glutamine + glutamic acid (20%), proline (12%), and asparagine + aspartic acid (13%). In addition, SMR2 and GRPs share a certain number of structural properties. They display an overall negative charge and a similar distribution of charges along the sequence, with an excess of negative charges in the central part of the proteins and an excess of positive charges in the carboxyl-terminal region. Analysis of SMR2 and GRP primary structure by the method of Hopp and Woods (32) (data not shown and Ref. 2) reveals that they are hydrophilic, except in the amino-terminal (signal peptide) and carboxyl-terminal regions, which are more hydrophobic. Curiously, both GRPs and SMR2 have the same aberrant behavior on NaDodSO4-PAGE. Their predicted molecular masses (including signal peptide) are, respectively, 26.5 kDa and 15.4 kDa, while the molecular masses determined from the electrophoretic mobility of the in vitro translation products are about 70 kDa (2) and 35 kDa. By analogy with the aberrant mobility observed for collagen (33), PRP (34), and chromogranins (35), Mirels et al. (2) have suggested that GRP aberrant behavior may be due to the high proline content and to the anionic net charge.
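The grouped composition figures quoted above (glutamine + glutamic acid, proline, asparagine + aspartic acid) are simple residue percentages. The following is a minimal Python sketch of that arithmetic, not the procedure used in this work. The function name `composition`, the `GROUPS` table, and the toy peptide are illustrative assumptions; the toy peptide is not the SMR2 sequence.

```python
from collections import Counter

# Residue groups, using standard one-letter amino acid codes.
GROUPS = {"Gln+Glu": "QE", "Pro": "P", "Asn+Asp": "ND"}


def composition(protein, groups=GROUPS):
    """Return the percentage of residues that fall in each named group."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {
        name: 100.0 * sum(counts[aa] for aa in members) / len(seq)
        for name, members in groups.items()
    }


# Hypothetical 10-residue peptide used only to exercise the function.
toy_peptide = "MQEPPNDQAK"
percentages = composition(toy_peptide)
```

Applied to a full predicted protein sequence, the same function would give percentages comparable to those cited in the text.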
However, a major difference between SMR2 and GRPs concerns the number of internal repeats. The peptidic motif which is found perfectly repeated four or five times in GRPs only occurs once in SMR2. One hypothesis is that the duplication leading to the separation of the GRP and SMR2 genes could have occurred before the process of internal duplication in GRPs. Alternatively, most of the repeats could have been deleted in the SMR2 gene (for instance, during an unequal crossing-over event). The structure of the SMR2 gene is very similar to that described for the GRP genes (8). In both cases, there is one intron at the end of the signal peptide sequence and one immediately downstream from the stop codon. However, in addition to the two introns found in GRP genes, the SMR2 gene contains a third intron splitting the transition region. Interestingly, the other family of salivary contiguous repeat proteins (proline-rich proteins) shares the same gene organization and can also be divided into two groups on the basis of the presence or absence of a third intron inside the transition region. The human PRH1 and PRH2 genes (10) and the hamster H29 gene (12) have the same structure as the SMR2 gene in that exon II is divided into two parts (exon IIa and exon IIb) by an intron in the transition region (Fig. 6). On the other hand, the mouse M14 and MP2 genes (9), like GRP genes, contain a unique exon (exon II) encoding the entire transition region, the repetitive segments, and the carboxyl-terminal region. However, Ann et al. (11) have proposed that the combination of exons IIa and IIb in the mouse PRP genes could represent a species-specific difference in PRP gene structure.
Curiously, in the region where intron II occurs in the SMR2 gene, GRP mRNAs contain a 15-bp sequence which does not align with the SMR2 mRNA sequence (see dot matrix of homology in Fig. 4). In an attempt to optimize these alignments, we found that the sequence present in GRP mRNA and absent in SMR2 mRNA is highly homologous to the 3' end of the SMR2 intron II. As shown in Fig. 7, one possibility is that this sequence was originally a part of the exonic sequence. We propose that, after insertion of intron II into a GRP-like gene, this sequence was released inside intron II by the use of a new, more 3' splicing site, leading to the present structure of the SMR2 gene. Since the sequences of PRP genes have diverged too much in this region, they cannot be used to verify the validity of such a model. One should therefore also consider the possibility that this region of the ancestral gene had the same structure as in the SMR2 gene and that the GRP structure was created by the imprecise excision of intron II.
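A dot matrix comparison of the kind referred to above marks every pair of positions at which short windows of two sequences match. A segment present in only one sequence then appears as a jump from one diagonal to another. The following is a minimal Python sketch of the general technique. The function name `dot_matrix`, the window size, the match threshold, and the toy sequences are illustrative assumptions, not the settings or data used for Fig. 4.

```python
def dot_matrix(a, b, window=3, min_matches=3):
    """Return (i, j) pairs where the windows starting at a[i] and b[j]
    share at least `min_matches` identical positions."""
    hits = []
    for i in range(len(a) - window + 1):
        for j in range(len(b) - window + 1):
            matches = sum(x == y for x, y in zip(a[i:i + window], b[j:j + window]))
            if matches >= min_matches:
                hits.append((i, j))
    return hits


# Toy sequences: `longer` carries a 4-base segment (TTTT) absent from `shorter`.
shorter = "AAACCCGGG"
longer = "AAACCCTTTTGGG"
hits = dot_matrix(shorter, longer)
# Diagonal offsets (j - i) reveal the shift caused by the extra segment.
offsets = sorted({j - i for i, j in hits})
```

In this toy case the hits lie on two diagonals whose offsets differ by the length of the extra segment, which is the pattern expected around a sequence present in one mRNA and absent from the other.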
Exons I of the SMR2 and GRP genes are 92% homologous and are also highly homologous to that of PRP genes (Fig. 8). The potential significance of this surprising conservation of exon I in PRPs has been discussed by Ann et al. (9). This sequence could be critical for the synthesis and/or secretion of these proteins.
It has been proposed that the transition region and the carboxyl-terminal region of the PRPs could have emerged, like the repeats, during the internal duplication events (9). A surprising finding is that SMR2 mRNA is about 75% homologous to GRP mRNAs (including the transition region and the 3'-untranslated region) except in the carboxyl-terminal region, where no significant homology can be found. A similar observation has also been reported for the two characterized GRP mRNAs. The sequences of these mRNAs are identical except in the carboxyl-terminal regions, which are only about 60% homologous in terms of nucleotide sequence. Heinrich and Habener (8) have therefore proposed that the two genes have undergone multiple conversion events in the recent evolutionary past.
Since more than 10 GRP genes have been detected by Southern blot analysis of rat DNA (8), we cannot exclude that some other GRP genes share the same carboxyl-terminal region as SMR2. The high level of sequence divergence in the carboxyl-terminal region, together with differences in repeat number, could contribute to functional specificity among the different GRPs and SMR2.
Although SMR2 and GRP genes obviously have a common evolutionary origin, their expression is subject to different regulation.
No sexual dimorphism has been reported for GRP gene expression. In contrast, SMR2 mRNA accumulation is about 20 times higher in the SMG of male than female rats. Whether the androgen regulation is transcriptional or not, and direct or indirect, is under investigation.
However, since the 5'- and 3'-untranslated regions, which are most often involved in mRNA stability (36), are relatively well conserved between SMR2 and GRP genes, a mechanism of mRNA stabilization by androgens seems rather improbable. Due to the high level of sequence homology between the GRP and SMR2 genes, this family could therefore provide an interesting model to study the mechanisms involved in differential regulation of the genes expressed in the SMG, and particularly in the regulation by androgens.
Acknowledgments-We especially thank Dr. M. Tosi for helpful discussions.
We also thank C.