Osteogenic Protein-2 A NEW MEMBER OF THE TRANSFORMING GROWTH FACTOR-β SUPERFAMILY EXPRESSED EARLY IN EMBRYOGENESIS*

Osteogenic protein-2, OP-2, a new member of the transforming growth factor-β (TGF-β) superfamily, closely related to the osteogenic/bone morphogenetic proteins, was discovered in mouse embryo and human hippocampus cDNA libraries. The TGF-β domain of OP-2 shows 74% identity to OP-1, 75% to Vgr-1, and 76% to BMP-5; hence OP-2 may also have bone inductive activity. The genomic locus of OP-2 has seven exons, like OP-1, and spans more than 27 kilobases (kb). In the C-terminal TGF-β domain, OP-2 has a unique additional cysteine. Mouse embryos express relatively high levels of OP-2 mRNA at 8 days, as two species of 3 and 5 kb. A careful study of mRNA expression of the osteogenic proteins in specific organs revealed discrete mRNA species for BMP-3, BMP-4, BMP-5, and BMP-6/Vgr-1 in lung or liver of young and adult mice. OP-1 is expressed in kidney; however, OP-2 and BMP-2 mRNAs were not detected in any organs studied, suggesting an early developmental role.

residues including a pattern of 7 cysteines.
In their biological roles, these proteins act during embryogenesis and in adult animals. For example, the DPP gene product is a determinant of the dorsal-ventral pattern formation of the fly, and null mutations have been shown to be lethal (17). Inhibins and activins control secretion of follicle-stimulating hormone (3, 5). Moreover, activins A and B show mesoderm-inducing activity (18-20). BMP-4 also induced mesoderm in a similar assay (21). MIS causes regression of the Müllerian duct during male development (2, 22). Recombinant BMP-2 (23), BMP-4 (24), and OP-1 (25) induce new bone formation in a subcutaneous implant assay in rats (26). Recombinant OP-1 homodimers, produced in mammalian cell culture, have been effective in the repair of segmental bone defects in rabbit ulna (25, 27). A major site of OP-1 expression was found to be the kidneys (28). Expression of a bone growth factor in the kidneys is consistent with their role in calcium regulation and bone homeostasis. The original source of osteogenic protein, the bone tissue, contains several TGF-β-related proteins as evidenced by protein sequence analysis of purified osteogenic activity (6, 29). The OP-1 gene and other BMPs were cloned using a consensus probe, based on partial protein sequence information and conserved elements of the TGF-β family (9). Screening of cDNA libraries with an OP-1 probe led to the discovery of OP-2, which is described here.

MATERIALS AND METHODS
Library Screening-All libraries were screened by an initial plating of 1 × 10⁶ plaques (approximately 5 × 10⁴ plaques/plate), and hybridizations were in 40% formamide, 5 × SSPE, 5 × Denhardt's solution, and 0.1% SDS at 37 °C. Nonspecific counts were removed in 0.1 × SSPE, 0.1% SDS by shaking at 50 °C. Murine OP-2 cDNA was found in a 17-day mouse embryo 5'-stretch cDNA library (λgt10, ML1029a, Clontech, Palo Alto, CA) and also in a teratocarcinoma (PCC4) cDNA library (ZAPII, 936301, Stratagene, La Jolla, CA). The complete human OP-2-coding sequence was derived from a cDNA clone in a human hippocampus cDNA library (λ ZAPII, 936205, Stratagene; from brain of a normal 2-year-old girl) and a genomic clone (EMBL-3 library, HL1067J, Clontech). The latter library also yielded the other clones used to determine the human OP-2 genomic locus. The genomic structure of human OP-1 was determined using two different libraries (Clontech, HL1067J; Stratagene, 946203). For identification of the splice junctions of OP-1 the following subclones were prepared. Exon 1 of OP-1, with both splice junctions, was subcloned as a 1.7-kb PstI fragment (p0142-12), and exon 2, with splice junctions, was cloned as a 0.6-kb HaeIII fragment (p0247-5). Part of the third exon and upstream junction was cloned as a 1.7-kb PstI fragment (p0243), whereas the downstream junction was cloned as a 0.3-kb Sau3AI fragment (p0248).
DNA Sequencing-All sequencing was done according to Sanger et al. (30) using exonuclease III-mediated unidirectional deletion (31), subcloning of restriction fragments, and synthetic primers. The coding regions of human and murine OP-2 DNA were sequenced on both strands. Compressions were resolved by performing reactions at 70 °C with Taq polymerase and using 7-deaza-GTP (U. S. Biochemical Corp.).
Preparation of RNA and Northern Blot Analysis-Mice, strain CD-1, and rats (Long-Evans) were from Charles River Laboratories, Wilmington, MA. Total RNA from mice was isolated by the acid guanidine thiocyanate-phenol-chloroform method (32). Poly(A)+ RNA was analyzed on 1.2% agarose-formaldehyde gels and blotted onto Nytran membranes (Schleicher & Schuell) with 10 × SSC (32). Hybridization conditions with 32P-labeled (33) probes were as described (28). Between hybridizations, filters were deprobed in 1 mM Tris-HCl, 1 mM EDTA, 0.1% SDS, pH 7.5, at 90-95 °C and exposed to film to assure complete removal of probe.
Cloning of Vgr-1 and GDF-1 cDNA and BMP-5-specific cDNA Fragments-A partial Vgr-1 clone, lacking some 300 nucleotides at the 5'-coding region, was isolated from a mouse brain cDNA library (Clontech, ML1036a) using an OP-1 cDNA probe. A murine GDF-1 cDNA spanning the complete coding region was isolated by PCR. Oligo(dT) was used to prime first-strand synthesis from 200 ng of mouse brain poly(A)+ RNA. Reactions were in a 100-µl volume with 5 µl of 20 × PCR reaction buffer (1 M Tris-HCl, pH 9.0, 400 mM ammonium sulfate, 30 mM magnesium chloride), 2 µl of 10 mM dNTPs, 100 pmol of oligo(dT), and 200 units of Moloney murine leukemia virus reverse transcriptase for 60 min at 37 °C. Reactions were incubated for an additional 30 min at 37 °C with 5 units of RNase H (GIBCO-BRL) followed by 5 min at 95 °C. For amplification, 5'-GDF (GCGCAAGCTTGGACACCTCCTGGGAGG) and 3'-GDF (GGAATTCCTCAACGGCAGCCACACTCATC) primers were added to 0.5 µM. The cDNAs with 2.5 units of Replinase (Du Pont-New England Nuclear) were subjected to 1 cycle of denaturation at 94 °C for 2 min, annealing at 45 °C for 2 min, and polymerization at 72 °C for 20 min, followed by 35 cycles of denaturation at 94 °C for 1 min, annealing at 55 °C for 2 min, and polymerization at 72 °C for 2 min. For mouse BMP-5, a 287-bp fragment was isolated by PCR from mouse embryo mRNA under conditions similar to those described for GDF-1. A 3'-BMP-5-specific PCR primer (CCATGTCAGCATCATTCAG) was used for first-strand synthesis. A 5'-BMP-5-specific PCR primer (CCAGACCATTTTCACCTG) was added for PCR amplification. PCR fragments were gel purified and cloned into a Bluescript KS(-) vector.
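The final concentrations in the 100-µl first-strand reaction are not stated explicitly, but they follow from the stock concentrations and volumes given above by simple dilution. The following is a minimal sketch of that arithmetic, not part of the original protocol; the function and variable names are illustrative, and the 10 mM dNTP stock is taken at face value (the text does not say whether this is per nucleotide or total).

```python
# Dilution arithmetic for the first-strand reaction described in the text.
# C_final = C_stock * V_stock / V_total. All names here are illustrative.

def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Return the concentration of a stock after dilution into the total volume."""
    return stock_conc * stock_volume_ul / total_volume_ul

TOTAL_UL = 100.0  # reaction volume quoted in the text

# Components of the 20x PCR reaction buffer, in mM (1 M Tris-HCl = 1000 mM).
buffer_stock_mM = {
    "Tris-HCl": 1000.0,
    "ammonium sulfate": 400.0,
    "magnesium chloride": 30.0,
}

# 5 µl of 20x buffer goes into the 100-µl reaction.
final_mM = {
    name: final_concentration(conc, 5.0, TOTAL_UL)
    for name, conc in buffer_stock_mM.items()
}

# 2 µl of the 10 mM dNTP stock.
final_mM["dNTPs"] = final_concentration(10.0, 2.0, TOTAL_UL)

# 100 pmol of oligo(dT) in 100 µl; pmol/µl is numerically equal to µM.
oligo_dt_uM = 100.0 / TOTAL_UL

for name, conc in final_mM.items():
    print(f"{name}: {conc:g} mM")
print(f"oligo(dT): {oligo_dt_uM:g} µM")
```

Under these assumptions the reaction contains 50 mM Tris-HCl, 20 mM ammonium sulfate, 1.5 mM magnesium chloride, 0.2 mM dNTPs, and 1 µM oligo(dT), i.e. the buffer is used at 1 ×.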

RESULTS
Cloning of Murine and Human OP-2 cDNA-In an effort to isolate additional OP-1-related genes, we screened several mouse cDNA libraries, using an OP-1 probe derived from the TGF-β domain (0.32-kb StuI-EcoRI fragment). A 17-day mouse embryo cDNA library yielded not only the murine homolog of the human OP-1 gene (28) but also the new gene, termed OP-2. Only one OP-2 clone was found compared to four of OP-1, indicating a low abundance at the 17-day stage. A murine OP-2 clone was also found in a teratocarcinoma (PCC4) cDNA library.
The human OP-2 gene was isolated from a hippocampus cDNA library, previously the source of human OP-1 cDNA, by screening with a murine OP-2 probe specific for the pro-region (0.3-kb EcoRI-BamHI fragment). A positive clone (λ024 and subclone p0166) shared extensive sequence homology with murine OP-2. It contained 0.4 kb of 5'-noncoding sequence, the complete pro-region, and the first half of the TGF-β domain but lacked 0.14 kb from the C terminus. The last portion of the TGF-β domain was obtained from a human genomic library (EMBL-3, Clontech): λ028 contained the last four exons of human OP-2 and provided the missing part of the TGF-β domain on a BamHI-PstI fragment of 0.8 kb (p0173-2). Analysis of exons 4, 5, and 6 from this genomic clone revealed nucleotide sequences identical to the respective regions in the human cDNA. Subsequently, the entire human genomic OP-2 locus was isolated.
Based on cDNA and genomic DNA sequence, the predicted size for human pre-pro OP-2 is 402 amino acids (Fig. 1). The putative secretion signal peptide contains 19 amino acids.
Alignments of the TGF-β Domains of OP-2 and Related Proteins-A distinctive feature of OP-2 is the presence of an additional cysteine in the TGF-β domain, which corresponds to a tyrosine residue in OP-1. OP-2 is the only TGF-β-related protein that breaks the pattern of 7 conserved cysteines in the TGF-β domain, as seen in the alignment with OP-1, BMP-5, Vgr-1, BMP-2, BMP-3, BMP-4, DPP, 60A, Vg-1, GDF-1, activins A and B, and TGF-β1 (Fig. 2). This alignment places OP-2 near OP-1, Vgr-1, BMP-5, and also 60A. Similarities, as percentage of identical amino acids, are compiled in Table I. OP-2 is about equally related (in its TGF-β domain) to OP-1 (74%), Vgr-1 (BMP-6) (75%), and BMP-5 (76%) and also very close to the Drosophila 60A protein (65%). It is notably more distant from BMP-2 and BMP-4 (55%), which have arginine rather than histidine at the C terminus (Fig. 2). A glycosylation site in the center of the TGF-β domain is shared by OP-2, OP-1, BMP-5, BMP-6, 60A, DPP, BMP-2, and BMP-4, but is absent in BMP-3 and more distantly related proteins.
Comparison of Human and Murine OP-2-Members of the TGF-β superfamily typically show extensive conservation across species. The alignment of human and mouse OP-2 (Fig. 1) shows only 4 amino acid changes in the TGF-β domain, while the mature proteins differ by 13 amino acids. In the pro-region, murine OP-2 is 3 amino acids shorter than human OP-2. Some divergence is found near residue 90 of OP-2, where OP-2 and OP-1 diverged the most (Fig. 3). Overall, the total of 59 aa changes between human and murine pre-pro OP-2 significantly exceeds the 11 changes between human and murine OP-1 (28). Diminished conservation across species is also found with BMP-3 and GDF-1 (10).
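Percent identities such as those in Table I, and the residue-change counts between human and murine OP-2, can be computed from a pairwise alignment along the following lines. This is a minimal sketch, not the procedure used in this study: the function names are illustrative, the two aligned strings are toy sequences rather than OP-2 data, and the choice to exclude gapped columns from the identity denominator is an assumption, since the text does not state how gaps were treated.

```python
# Sketch: percent identity and change counts from two pre-aligned sequences.
# Toy input only; gap handling is an assumption, not taken from the paper.

def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of ungapped alignment columns whose residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared

def count_changes(aligned_a, aligned_b):
    """Number of alignment columns that differ; a gap counts as a change."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(aligned_a, aligned_b) if a != b)

# Hypothetical aligned fragments (one substitution, one gap).
seq_a = "ACDEFGHIKL"
seq_b = "ACDQFGHI-L"

identity = percent_identity(seq_a, seq_b)
changes = count_changes(seq_a, seq_b)
print(f"identity: {identity:.1f}%  changes: {changes}")
```

With a different gap convention (for example, counting gapped columns in the denominator) the percentages shift slightly, which is worth keeping in mind when comparing values across publications.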

OP-2, a New Member of the TGF-β Superfamily
Comparison with OP-1-Alignments of human OP-2 with OP-1 (Fig. 3) may be used to decipher the structure and possible role of the pro-regions and to confirm the deduced sequences. OP-2 has only marginal homology with OP-1 in the signal peptide and adjacent pro-region (the first 33 residues). A nearby RXXR protease site of OP-1 is replaced in OP-2 by RXXG, at residue 37. However, a substitute RXXR pattern appears, shifted by 5 amino acids.
The pro-regions of OP-2 and OP-1 share a potential glycosylation site (present also in BMP-5 and -6). While OP-1 lacks cysteines outside of the TGF-β domain, OP-2 contains a cysteine in the signal peptide and a second one at residue 31. A third cysteine is at residue 205 in the pro-region of human but not murine OP-2.
Proteolytic Maturation Sites in the Pro-domain-The proteolytic cleavages resulting in removal of pro-regions from the mature proteins occur immediately past the sequence RXXR in members of this family (7, 16, 28). Mutants of OP-1 with a minor alteration of the RXXR sequence are not properly cleaved (Kaplan, P., Dorai, H., Ozkaynak ...). Several genes of this family encode multiple RXXR patterns in the pro-region. For example, human OP-2 has three additional RXXR sites in the pro-region.
Comparison of the mature N termini, up to the first cysteine of the TGF-β domain, for different superfamily members leads to interesting observations regarding length and composition (Fig. 4). Long N termini are found for the closely related OP-1, OP-2, and BMP-6 (37 aa) and BMP-5 (36 aa). N-terminal residues of BMP-5, BMP-6, and OP-1 differ mainly by conservative changes. The alignment of charged residues is nearly perfect for these proteins, and somewhat less so for OP-2. OP-2 lacks two potential glycosylation sites found in the mature N terminus of OP-1, BMP-5, and BMP-6.
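Candidate RXXR maturation sites of the kind discussed in this section can be located in a precursor sequence with a simple pattern scan. The sketch below is illustrative only: the function name and the toy sequence are hypothetical (the sequence is not OP-2), and a regular-expression lookahead is used so that overlapping RXXR patterns are also reported. It lists each motif with its 1-based start position and the residue that would follow the cleavage point.

```python
import re

# Zero-width lookahead so that overlapping R-X-X-R patterns are all found.
RXXR = re.compile(r"(?=(R..R))")

def find_rxxr_sites(protein):
    """Return (start, motif, next_residue) tuples for every RXXR pattern.

    start is the 1-based position of the first arginine; next_residue is the
    residue immediately after the motif, or "" if the motif ends the sequence.
    """
    sites = []
    for match in RXXR.finditer(protein):
        start = match.start() + 1
        motif = match.group(1)
        after = match.start() + 4
        next_residue = protein[after] if after < len(protein) else ""
        sites.append((start, motif, next_residue))
    return sites

# Hypothetical toy sequence, used only to exercise the scan.
toy = "MKLRSTRAGGRRARSLLRQQG"
for start, motif, next_residue in find_rxxr_sites(toy):
    print(start, motif, next_residue)
```

A pattern match only marks a potential site; whether a given RXXR sequence is actually cleaved has to be established experimentally.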
The mature N termini of OP-1, OP-2, BMP-5, BMP-6, BMP-2, BMP-4, BMP-3, and even DPP and 60A from Drosophila, share a high content of basic amino acid residues. This feature is found in all members of the superfamily that have osteogenic potential. It is absent in others, such as the TGF-β group, the activins, and MIS.
OP-2 mRNA Expression-In order to detect OP-2 expression we screened mRNA preparations from several organ tissues of 2-day-old rats, such as brain, calvaria, heart, kidney, and lung, by Northern blot hybridization (Fig. 5). For this analysis brain was chosen since human OP-2 was isolated from a hippocampus cDNA library, calvaria were chosen as representative of bony tissue, and kidneys are a source of OP-1 mRNA (28). However, no OP-2 mRNA was detected in this analysis. As a control we also probed with related genes in successive hybridizations and were able to detect mRNA of OP-1, Vgr-1, and BMP-4. OP-1 was found mainly in kidneys, Vgr-1 and BMP-4 mainly in lungs. A more detailed analysis is described below (see Fig. 7).
Since OP-2 cDNA was discovered in a mouse embryo cDNA library, we then analyzed OP-2 mRNA expression in 8-, 10-, and 17-day mouse embryos as well as 6-day postnatal animals by Northern blot analysis of poly(A)+ RNA. Extensive OP-2 mRNA expression was found in 8-day embryos; the message fell drastically in 10-day embryos and was virtually absent in 17-day embryos (Fig. 6). The Northern result is consistent with the low abundance of OP-2 cDNA clones in a 17-day mouse embryo library (one clone in 10⁶). Extremely low levels were detected in 6-day postnatal animals and kidneys from 2-week-old animals.
mRNA Expression of Other TGF-β Superfamily Members in Specific Organs-In order to detect OP-2 expression in grown animals we screened several mouse organs by Northern hybridization. We extended this analysis to eight TGF-β-related genes (OP-1, OP-2, Vgr-1, BMP-2, BMP-3, BMP-4, BMP-5, and GDF-1) as no stringent or comprehensive study of mRNA expression is found in the literature. Cross-hybridization was minimized by using specific probes corresponding to highly diverged sequences.
Poly(A)+ RNA from brain, spleen, lung, heart, liver, and kidney of 2-week-old and 6-9-month-old male mice was analyzed on 1.2% agarose-formaldehyde gels, transferred to a membrane, and hybridized sequentially to different probes (Fig. 7).
Clear results were obtained for most genes, but no mRNA was detected in any organs studied for OP-2 (data not shown) and BMP-2. OP-1 message is mainly present in the kidneys and at a lower level in brain tissue as a total of four species (1.8, 2.2, 2.4, and 4.0 kb) (28). BMP-6/Vgr-1 mRNA, a single species of about 3.7 kb, is found in lungs and at a low level in kidneys. BMP-5 mRNA is located in lungs and liver as a single species of 4.2 kb. BMP-4 mRNA is mainly in lungs and at much lower levels in kidneys (two species of 1.8 and 2.1 kb). A BMP-4-specific signal was also detected in spleen and liver, although it migrated slightly differently. Possibly this signal results from cross-hybridization with an unknown BMP-4-related gene. BMP-3 mRNA is observed in lungs (two species of 2.5 and 7 kb and a minor species of 3.2 kb). GDF-1-specific message of 3 kb is found only in brain, in accordance with Lee (10).
Age-dependent mRNA Expression-An age-dependent mRNA expression is not only observed for OP-2 but also with related genes: OP-1 mRNA in the kidneys of young animals (2 weeks old) is approximately twice the level of adult animals (6-9 months old) (Fig. 7). Similarly, BMP-5 mRNA is detected in the lungs and liver of young animals, much more so than in adult animals. For Vgr-1 and BMP-4, a reverse pattern of age-dependent expression is observed in the lungs, with levels approximately three times higher in adult over young animals. BMP-3 and GDF-1 mRNA levels in lungs and brain remained unchanged with age.
Presence of Multiple or Large Transcripts-The presence of oversized and multiple mRNA transcripts of OP-2, two species of 3 and 5 kb in 8-day embryos, has ample precedent among the other related genes. Most members of the superfamily have transcripts much larger than the expected 1.4-1.8-kb size, and several have transcripts of different sizes (Fig. 7). Northern blots showed four OP-1 mRNA species: 1.8, 2.2, 2.4, and 4 kb (28). BMP-3 also has transcripts of multiple sizes (2.5 and 7.0 kb). BMP-5 and Vgr-1 (8) have single-size transcripts of about 4 kb, much longer than expected.
Genomic Structure of Human BMP-4-Screening of a human genomic library with a consensus gene probe (9) resulted in isolation of the entire genomic BMP-4 in one clone (λ18) on two EcoRI fragments (p050, p044). The complete BMP-4 coding region is present in only two exons, 0.37 and 0.86 kb, interrupted by an intron of 0.96 kb (Fig. 8).

DISCUSSION
The present study describes a new gene which is closely related to OP-1, BMP-5, Vgr-1, and Drosophila 60A. Based on the extensive homology with other osteogenic/bone morphogenetic proteins that are active in the rat subcutaneous implant assay, it is expected that OP-2 has osteogenic potential. The genomic structures of OP-1 and OP-2 both contain seven exons, spread over a large distance with matching exon-intron boundaries, thus setting them apart from BMP-2, BMP-4, and BMP-3. OP-2 is uniquely marked by an 8th cysteine in the "7-cysteine domain" of the TGF-β superfamily. Since the mature members of the TGF-β superfamily are dimeric, this extra cysteine may participate in and stabilize the dimer formation. Recently the crystal structure of TGF-β has been published (34, 35), and a single intermolecular disulfide bridge has been identified. Substitution of the respective cysteine in OP-2 by serine, for example by site-directed mutagenesis, might show whether the additional cysteine can preserve the dimer structure.

A close inspection of the proteolytic maturation sites of several precursors shows that the pattern RXXR can be further specified as RXXR↓A or RXXR↓S, which are found in the majority of cases. In OP-2 the TGF-β domain follows two potential maturation sites, 21 residues apart. Multiple RXXR sites near the TGF-β domain are also seen in BMP-2 and BMP-4, about 30 residues apart. Additional proteolytic maturation sites are found further upstream in the pro-region. In human OP-2 the multiple maturation sites could release five potential polypeptides, spanning from amino acid 20 (signal peptide cleavage site) to aa 42, from 43 to 101, from 102 to 242, from 243 to 263, and the mature C-terminal domain, with the possibility of additional biological activities.
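The sizes of the five potential polypeptides follow directly from the residue boundaries listed above. The short sketch below works through that arithmetic; it is an illustration, not an analysis from this study. The segment labels are hypothetical, and the extent of the mature C-terminal domain (residues 264 to 402) is an assumption inferred from the last stated boundary and the 402-residue precursor length given in Results.

```python
# Sketch: lengths of the potential human OP-2 cleavage products.
# Boundaries are inclusive, 1-based residue numbers quoted in the text,
# except the mature domain, whose range (264-402) is an assumption.

PRECURSOR_LENGTH = 402   # predicted pre-pro OP-2 size (from Results)
SIGNAL_PEPTIDE_LENGTH = 19

segments = [
    ("pro-region fragment 1", 20, 42),
    ("pro-region fragment 2", 43, 101),
    ("pro-region fragment 3", 102, 242),
    ("pro-region fragment 4", 243, 263),
    ("mature C-terminal domain (assumed range)", 264, PRECURSOR_LENGTH),
]

def segment_lengths(segs):
    """Map each segment name to its length in residues (inclusive bounds)."""
    return {name: last - first + 1 for name, first, last in segs}

def is_contiguous(segs):
    """True if each segment starts right after the previous one ends."""
    return all(
        segs[i + 1][1] == segs[i][2] + 1 for i in range(len(segs) - 1)
    )

lengths = segment_lengths(segments)
for name, length in lengths.items():
    print(f"{name}: {length} aa")

# Consistency check: signal peptide plus all fragments covers the precursor.
total = SIGNAL_PEPTIDE_LENGTH + sum(lengths.values())
print("contiguous:", is_contiguous(segments), " total:", total)
```

Under these assumptions the fragments are 23, 59, 141, 21, and 139 residues long; the 21-residue fragment matches the statement that the two maturation sites preceding the TGF-β domain lie 21 residues apart.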
The mature N termini of different members of the TGF-β superfamily are quite diverse in their amino acid sequence. The extensive variations found in the mature N termini may provide distinction between the otherwise conserved mature proteins. Possibly this region has diverged because it is not crucial for receptor binding or protein folding. However, for a given protein this region is still relatively well conserved among animal species, in support of a functional role. N-terminal sequences may supply the individual proteins with more specific recognition for their respective receptors. They may allow targeting of specific tissues via binding to extracellular matrix. The N terminus of OP-2 has some similarity in composition to nuclear localization sequences that contain several prolines interspersed with lysines or arginines (36), a feature not seen for the other osteogenic proteins.
The high content of basic amino acids in the N terminus is characteristic for the osteogenic proteins. In contrast, a mature N terminus of only 10-14 amino acids with few basic residues is characteristic for the more distantly related proteins, TGF-β1-5, the activins, inhibin α, and MIS. The positively charged N termini of the osteogenic proteins may permit binding of the proteins to hydroxylapatite, a resin traditionally used during purification of the osteogenic protein (29). In vivo, the basic N termini may mediate the deposition of osteogenic proteins in bony tissue. However, in the subcutaneous bone induction assay the N termini are not essential for biological activity (25), perhaps due to the local administration of the protein, immobilized on collagen matrix. The insect proteins, DPP and 60A, also display positively charged N termini, even though their role is not bone induction.
The analysis of mRNA species for osteogenic proteins has been difficult due to the extremely low level of natural expression of these proteins, and published data are sparse. Analysis by dot blot or by in situ hybridization is confounded by the possibility of cross hybridization of probes to different members. In our study we have minimized this possibility by careful choice of probes. Selection of poly(A)+ RNA on oligo(dT)-cellulose is necessary due to low levels of expression of these mRNAs. The low level of expression may be a reflection of regulatory function and high biological activity. Nanogram amounts suffice for subcutaneous bone induction in the rat model (29).
Northern analysis of several organs with probes for various family members indicates that mRNA for most of them is expressed in specific organs. OP-1 mRNA is expressed mainly in kidneys and bladder (28), which may explain the epithelial osteogenesis discovered by Huggins over 60 years ago. Huggins (38) noted that urinary tract epithelia implanted into the abdominal wall of dogs evoked large amounts of bone within 12 days. OP-1 mRNA is also expressed in brain, which is the sole site of GDF-1 expression. In contrast, we find BMP-3, BMP-4, BMP-5, and BMP-6/Vgr-1 to be primarily expressed in the lungs. The lungs may participate in the growth regulation of bone and connective tissues in an endocrine manner, as proposed for the kidney (28).
OP-2- and BMP-2-specific mRNAs were not detected in any of the adult organs by our Northern hybridization analysis. OP-2 cDNA was found at low abundance in a hippocampus cDNA library and may be expressed at low levels in brain. However, OP-2 mRNA was found at relatively high levels in 8-day mouse embryos, indicating a developmental role; in the adult animal OP-2 and BMP-2 may be expressed in a more discrete location, or primarily during tissue regeneration. While the timing of expression for BMP-5 and OP-1 seems to be directly related to growth, an inverse relationship was found for Vgr-1 and BMP-4. The level of BMP-3 expression (in lungs) did not change with the age of the animal.
The early embryo displays two oversize OP-2 mRNA species. Hence, the OP-2 locus may be considerably larger than the 27 kb cloned so far. The multiple size transcripts observed for OP-1, OP-2, BMP-3, and BMP-4 may result from splicing events that affect coding or untranslated regions, or they may represent bicistronic mRNA species (10). Diverse transcripts have also been seen with DPP. These multiple DPP-specific mRNAs have been shown to be due to alternatively spliced 5'-untranslated exons (37).
There is an apparent redundancy of osteogenic proteins in the TGF-β superfamily. Since bone has different architectures and fine structures depending on the anatomical localization (for example, long bones, facial bones, skull plates, and dentin), the osteogenic proteins may have evolved along with the different types of bone. Roles other than bone formation are likely since analogs of the osteogenic proteins have been found in invertebrates (DPP and 60A of Drosophila). Multiple functions are indicated by the fact that the bone morphogenetic proteins are expressed in early embryonic development as well as later in life.