High Levels of Expression of Full Length Human Pro-α2(V) Collagen cDNA in Pro-α2(V)-deficient Hamster Cells*

A full length cDNA encoding human pro-α2(V) collagen was constructed. Partial sequencing of the cDNA and primer extension analysis of mRNA from fibroblasts found that pro-α2(V) mRNA differs from the mRNAs of other fibrillar collagens in the increased length of its 5'-untranslated region. The pro-α2(V) cDNA was placed downstream of the human cytomegalovirus immediate early promoter/regulatory sequences for expression studies in cultured Chinese hamster lung cells. These cells have been shown previously to synthesize large quantities of pro-α1(V) homotrimers as their only collagenous product. Transfection resulted in a number of clonal cell lines that express human α2(V) RNA at levels comparable to, and in some cases greater than, levels found in normal human skin fibroblasts. Pro-α2(V) chains produced in the majority of clonal lines were of sufficient quantity to complex all available endogenous pro-α1(V) chains. Chimeric heterotrimers of hamster α1(V) and human α2(V) chains, in a 2:1 ratio, were stable to pepsin digestion and were associated with the cell layer.

The 12 or more types of collagens identified thus far comprise the major structural components of the extracellular matrix and together represent approximately 30% of total body proteins in humans. Type I collagen, the major fibrous component of connective tissue, is the most abundant protein in the body. The more recently described type V collagen is widely distributed in many tissues as a pericellular component. Type V collagen is also frequently found closely associated with type I collagen as a component of interstitial fibers and may be involved in establishing the physical characteristics of such fibers (for a review, see Ref. 1). Partial characterization of sequences and intron/exon organization of the gene encoding the pro-α2(V) chain has demonstrated a close evolutionary relationship with the genes encoding collagen types I-III (2-4). Thus, type V collagen is now grouped with collagens I-III as a fibrillar or interstitial collagen. However, type V collagen is less abundant than collagen types I-IV (5), and the low levels of this protein present in tissues and cell cultures have limited both biochemical and molecular analyses. The function(s), chain composition, and processing of type V collagen remain, therefore, poorly characterized.
The most widely distributed form of type V collagen is formed through the association of one pro-α2(V) and two pro-α1(V) chains into a heterotrimer. A homotrimer composed of α1(V) chains has also been identified (6), and two other relatively uncharacterized type V α chains have been described (1). Homotrimers of α2(V) chains have been reported (7,8); however, renaturation studies have shown pepsin-derived α2(V) chains to be incapable of reassociating into stable homotrimers in vitro (9).
Here we describe construction of a full length cDNA encoding the human pro-α2(V) chain and its expression, under control of the human cytomegalovirus immediate early (HCMV-IE)¹ promoter/regulatory sequences, in cultured Chinese hamster lung (CHL) cells. These cells produce pro-α1(V) homotrimers as their only endogenous collagen (6,10-13). We show that human pro-α2(V) chains are produced at high levels in transfected cells and form stable heterotrimers with hamster pro-α1(V) chains, which are incorporated into matrix associated with the cell layer. This transfection system, combined with site-specific in vitro mutagenesis of the cDNA, may provide a new means for dissecting the complex biosynthesis of type V collagen.

MATERIALS AND METHODS
formed through hybridization of two overlapping 44-base oligomers, was double-stranded except for a 4-base overhang at its 5' end for ligation to the EcoRI site at the 3' end of the α2(V) insert of pBSL18. There was also a 4-base overhang at its 3' end for ligation to a BglII site upstream of the SV40 small t splice site (Fig. 3). The full length pro-α2(V) cDNA was placed upstream of the SV40 small t splice site and early polyadenylation signal and downstream of the HCMV-IE promoter/regulatory sequences isolated from expression vector pCAT wt 760 (15). The resultant construct, pGGH31 (see Fig. 3), was used in subsequent expression studies. Oligonucleotides were synthesized at the University of Wisconsin Biotechnology Center.
Cell Culture and Transfection-CHL cells (clone HT1) have been described previously (6,10-13). AH1F cells are normal neonatal foreskin fibroblasts (16). All cells were cultured in Dulbecco's modified Eagle's medium with 10% fetal calf serum. Cotransfection of CHL cells with pSV2neo and pGGH31 was by calcium phosphate precipitation as described (16), except that clonal lines were selected in 0.8 mg/ml G418 (GIBCO).
Isolation and Analysis of Nucleic Acids-Cytoplasmic RNA was isolated by disruption of cells in isotonic lysis buffer in the presence of Nonidet P-40 (Shell Chemicals), with removal of nuclei by centrifugation as described (17). Polyadenylated RNA was selected from cytoplasmic RNA by chromatography on (dT)12-15 cellulose (type 2; Collaborative Research, Inc.) (17).
For primer extension, a 22-base oligomer complementary to sequences from 112 to 133 bases upstream of the translation initiation codon of pro-α2(V) mRNA (see Fig. 1A) was 5'-end labeled with polynucleotide kinase and hybridized to 3 μg of polyadenylated RNA from the cytoplasm of AH1F cells. RNA·DNA hybridization was overnight at 52 °C. Otherwise, hybridization conditions and extension with avian myeloblastosis virus reverse transcriptase (Life Science) were as described previously (18). DNA sequences were obtained by dideoxy chain termination (16).
Isolation of Radiolabeled Collagen and Procollagen-Radiolabeling and isolation of collagen species for pepsin digestion were as described (16) except that precipitation from medium was achieved with 30% saturated (NH4)2SO4. Digestion with 100 μg/ml pepsin was in 0.5 M acetic acid (pH 2.0) for 6 h at 4 °C. Type V procollagen, which was not to be pepsinized (see Fig. 6), was extracted from the extracellular matrix associated with the cell layer by sequential extraction with 1% deoxycholate followed by 4% SDS (19). Cell layers were scraped into 1% deoxycholate, vortexed, and centrifuged at 100,000 × g for 30 min. Deoxycholate-insoluble material was rinsed in 1% deoxycholate and then solubilized by boiling in 4% SDS gel buffer prior to electrophoresis.

RESULTS
Sequence Analysis-The full length cDNA clone presented here contains 102 bp more 5'-untranslated sequence than reported previously (4) (Fig. 1A). In addition, DNA sequences at the 5' end of the full length pro-α2(V) clone differ from recently published α2(V) sequences derived from partial length cDNAs (3,4). Of particular interest is a thymidine residue, not appearing in sequences reported previously, which comprises part of an ATG codon upstream of the translation initiation codon of pro-α2(V) (Fig. 1B). This ATG and another ATG upstream of the pro-α2(V) initiation codon are both followed by short open reading frames potentially encoding two tetrapeptides (Fig. 1B). Thus, pro-α2(V) mRNA is similar to the mRNAs of pro-α1(I), pro-α2(I), and pro-α1(III) in that all contain at least two AUGs followed by short open reading frames upstream of the AUG which initiates translation of the procollagen chain (20). The additional thymidine residue also forms part of an inverted repeat surrounding the translation initiation codon of pro-α2(V). This inverted repeat is almost perfectly conserved in similar positions in the mRNAs of pro-α1(I), pro-α2(I), and pro-α1(III) chains and may form a stem-loop structure in which the positions of the two AUG codons are invariant, with the second AUG being the procollagen translation initiation site (Ref. 20 and Fig. 1B).
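The kind of scan that identifies upstream ATG codons followed by short open reading frames can be sketched as follows. This is a minimal illustration only: the function name `upstream_orfs`, the `max_codons` cutoff, and the sequence `TOY_UTR` are hypothetical and are not taken from the pro-α2(V) leader sequence or from the methods used in this study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_orfs(utr, max_codons=10):
    """List (start_index, coding_codons) for short ORFs that begin at an ATG
    and end at an in-frame stop codon inside the given 5'-untranslated sequence.

    max_codons is an arbitrary, hypothetical cutoff for what counts as "short".
    """
    seq = utr.upper().replace("U", "T")  # accept RNA or DNA letters
    found = []
    for i in range(len(seq) - 2):
        if seq[i:i + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for j in range(i + 3, len(seq) - 2, 3):
            if seq[j:j + 3] in STOP_CODONS:
                n_codons = (j - i) // 3  # coding codons, ATG included
                if n_codons <= max_codons:
                    found.append((i, n_codons))
                break
    return found


# Hypothetical toy leader with two ATGs, each followed by a 4-codon ORF
# (a potential tetrapeptide) and a stop codon. NOT the real sequence.
TOY_UTR = "GGCC" "ATGGCTCCTCACTGA" "CCGG" "ATGCCGGTTCTCTAA" "GCCACC"

print(upstream_orfs(TOY_UTR))  # [(4, 4), (23, 4)]
```

Each reported pair gives the position of an upstream ATG and the number of codons it could encode before an in-frame stop; a value of 4 corresponds to a potential tetrapeptide.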
Amino acids and nucleotides in the amino-terminal part of the main triple helical region which differ from sequences reported previously are given (Fig. 1C). These include a Gly-X-Y triplet missing from the sequence published previously
α2(V) cDNA clone reported here, however, contained 241 bases corresponding to 5'-untranslated sequences. To confirm that the 5'-untranslated region of pro-α2(V) mRNA is longer than those of the other fibrillar collagen chains, a synthetic DNA oligomer (Fig. 1) was annealed to polyadenylated RNA isolated from the cytoplasm of normal human diploid fibroblasts and extended with reverse transcriptase. The result was a major extension product of about 291 bases and a much less abundant product of about 240 bases (Fig. 2). This indicates that the major species of pro-α2(V) mRNA in fibroblasts from human skin have 5'-untranslated regions at least 402 bases long.
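The arithmetic linking extension product length to 5'-untranslated region length can be made explicit. The sketch below assumes that positions are counted upstream from the initiation codon, with the base immediately preceding the ATG as position 1, and that the labeled 5' end of the primer pairs with position 112 (the primer is complementary to positions 112 to 133). The function name is hypothetical.

```python
def utr_length_from_extension(product_len, primer_proximal_pos):
    """5'-untranslated length implied by a primer extension product.

    product_len: length of the extended DNA, primer included.
    primer_proximal_pos: mRNA position (bases upstream of the ATG, counting
        the base just before the ATG as 1) paired with the primer's 5' end.

    The product spans mRNA positions primer_proximal_pos through
    primer_proximal_pos + product_len - 1, and the latter is the 5' terminus.
    """
    return primer_proximal_pos + product_len - 1


major = utr_length_from_extension(291, 112)  # major product
minor = utr_length_from_extension(240, 112)  # less abundant product
print(major, minor)  # 402 351
```

Under these assumptions the 291-base product reproduces the 402-base figure given in the text, and the minor 240-base product would correspond to a 5'-untranslated region of about 351 bases.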
Expression of Human α2(V)-specific RNA in Transfected CHL Cells-The full length pro-α2(V) cDNA was placed downstream of the immediate early promoter/regulatory region of human cytomegalovirus (Fig. 3). These HCMV-IE regulatory sequences are significantly more active in directing transcription of transfected genes in a variety of eukaryotic cells than the Rous sarcoma virus long terminal repeat (24), which had previously been used in this laboratory to direct transcription of pro-α2(I) collagen cDNA (16). In order to provide pro-α2(V) cDNA transcripts with the ability to splice and be polyadenylated, the expression vector was furnished with the SV40 small t splice site and early polyadenylation signal. The resultant recombinant (pGGH31) was cotransfected with the selectable marker pSV2neo (25) into CHL cells, and G418-resistant clonal lines were analyzed by the S1 nuclease protection assay for the presence of human α2(V)-specific transcripts (Fig. 4). Of the 28 lines tested, 17 of which are shown (Fig. 4), the RNAs of 23 were found to protect a 393-base DNA fragment diagnostic for human α2(V) sequences. Surprisingly, some transfected CHL lines (Fig. 4, lanes 9 and 17) were found to produce higher levels of human α2(V) RNA than do normal human diploid fibroblasts (AH1F, Fig. 4). As expected, no protected fragment resulted from probe hybridized to the cytoplasmic RNA of untransfected CHL cells (Fig. 4).
FIG. 3. White, black, and hatched bars represent parts of the cDNA encoding untranslated regions, terminal propeptides, and the main triple helical region, respectively. The stippled bar represents SV40 sequences including the small t splice site (IVS) and early polyadenylation signal. A, ApaI; B, BamHI; Bg, BglII; E, EcoRI; Hc, HincII; H, HindIII; K, KpnI; N, NcoI; S, SacI. Arrows beneath the figure represent sequenced fragments. S1, the NcoI-EcoRI fragment used as the S1 probe; asterisk denotes the 32P-labeled 3' end; and the wavy line shows 153 bp from the pUC9 vector.

FIG. 4. Cytoplasmic RNA (25 μg) from AH1F, CHL, or one of 17 clonal lines resistant to G418 was hybridized at 59 °C to a double-stranded DNA probe specific for human α2(V) and digested at 20 °C with S1 nuclease as described (17). The probe extends from an NcoI site to an EcoRI site in the region of the human cDNA which encodes the main triple helical region (see Fig. 3). Appended to the 393 bp of α2(V) sequences is a 153-bp "tail" of pUC9 sequences which provides a size difference to distinguish protected DNA fragments from undigested probe. M, MspI-digested pBR322; P, undigested probe.
Association of Human α2(V) Chains with Hamster α1(V) Chains in Cell Layers-The six clonal lines found by S1 analysis to contain the highest levels of human α2(V)-specific RNA were metabolically labeled with [3H]proline and their media and cell layers analyzed separately for the presence of collagen species. As reported previously (6,13), untransfected CHL cells produce only α1(V) homotrimers that are localized to the cell layer (Fig. 5A, lane 1). In contrast to untransfected CHL cells, the cell layers of clonal lines that had been found previously to produce high levels of human α2(V)-specific RNA are shown to contain both pepsin-resistant α1(V) and α2(V) chains (Fig. 5A, lanes 2-7). Densitometric scanning of autofluorograms, exposed for varying lengths of time, gave ratios of α1(V) to α2(V) very closely approximating 2:1 for five of the six lanes containing the cell layer-associated collagens of clonal lines (Fig. 5A, lanes 2-6). The ratio of α1(V) to α2(V) chains in the seventh lane was 5.2:1.
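One way to read such densitometric ratios is to partition the α1(V) chains between heterotrimers and homotrimers. The sketch below is an illustration under stated assumptions, not an analysis performed in this study: it assumes that the pepsin-resistant material contains only heterotrimers with two α1(V) chains per α2(V) chain and α1(V) homotrimers, and that band intensity is proportional to chain number. The function name is hypothetical.

```python
def partition_alpha1(ratio_a1_to_a2):
    """Split alpha1(V) chains between heterotrimers and homotrimers.

    Assumes only 2:1 alpha1:alpha2 heterotrimers and alpha1 homotrimers are
    present, and that the measured ratio reflects chain numbers.

    Returns (fraction of alpha1 chains in heterotrimers,
             homotrimers per heterotrimer).
    """
    if ratio_a1_to_a2 < 2.0:
        raise ValueError("a ratio below 2:1 is not possible under this model")
    # Per alpha2 chain: 2 alpha1 chains sit in the heterotrimer,
    # the remaining (ratio - 2) are grouped three at a time into homotrimers.
    frac_in_heterotrimers = 2.0 / ratio_a1_to_a2
    homotrimers_per_heterotrimer = (ratio_a1_to_a2 - 2.0) / 3.0
    return frac_in_heterotrimers, homotrimers_per_heterotrimer


print(partition_alpha1(2.0))  # all alpha1 chains in heterotrimers
print(partition_alpha1(5.2))  # the lane with the 5.2:1 ratio
```

Under this model a 2:1 ratio leaves no α1(V) chains for homotrimers, whereas a 5.2:1 ratio would place roughly 38% of the α1(V) chains in heterotrimers, with about one homotrimer per heterotrimer.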
The medium of untransfected CHL cells is shown to contain low levels of a pepsin-resistant species (Fig. 5B, lane 1) that has been reported previously to be an approximately 85-kDa proteolytic cleavage product of α1(V) chains (10). In contrast, the media of clonal lines that produce human α2(V) chains contained small amounts of full length α1(V) chains (Fig. 5B, lanes 2-7). A second pepsin-resistant band found in the media of these cells (Fig. 5B, lanes 2-7) appears to be slightly larger than the putative α1(V) proteolytic fragment found in media of untransfected CHL cells (Fig. 5B, lane 1) and to comigrate with α2(V) chains. Thus, it is likely that the pepsin-resistant species in the media of clonal lines includes small quantities of heterotrimers. However, the lower band is rather broad and is more abundant than the α1(V) band. It therefore probably contains both α2(V) chains and α1(V) proteolytic fragments that have not been resolved during electrophoresis.
Analysis of Unpepsinized Collagen Species-In order to analyze procollagen species, unpepsinized samples from the extracellular matrix and media of a clonal line, designated A1, which had been found previously to produce the highest levels of α2(V) RNA (Fig. 4, lane 17) and protein (Fig. 5A, lane 2), were analyzed by SDS-polyacrylamide gel electrophoresis (Fig. 6). Unexpectedly, the media of these cells contained abundant pro-α2(V) chains that had not been incorporated into matrix. Based on the results of Fig. 5, the vast majority of these media pro-α2(V) chains are not stable to limited pepsin digestion. The extracellular matrix, isolated by sequential extraction of the cell layer in deoxycholate and SDS (19), was found to contain pro-α1(V) and pro-α2(V) chains in the ratio 1:1.7. This implies that in addition to pepsin-resistant heterotrimers found in the cell layer (Fig. 5A, lane 2), a small proportion of pepsin-sensitive pro-α2(V) chains, not bound to pro-α1(V) chains, has been incorporated or trapped in the extracellular matrix. The absence of processed type V collagen chains in the extracellular matrix of Fig. 6 is consistent with the results of Fessler et al. (13), which showed CHL cells to lack specific procollagen peptidase activity. In contrast to other reports in which varying amounts of pro-α2(V) have been found disulfide linked to pro-α1(V) chains (1), unreduced gels did not reveal disulfide links between pro-α2(V) and pro-α1(V) chains in the present study (data not shown).

DISCUSSION
In this study, we have constructed a full length cDNA encoding the human pro-α2(V) collagen chain in its entirety. Analysis of pro-α2(V) cDNA sequences indicates pro-α2(V) mRNA to contain a highly conserved inverted repeat and two short open reading frames around the translation initiation site similar to what has been reported for the mRNAs of other fibrillar collagen chains (20). However, pro-α2(V) mRNA is shown, by primer extension, to differ from the mRNAs of other fibrillar collagens in the greater length of its 5'-untranslated region.
Pro-α2(V) cDNA, driven by HCMV-IE promoter/regulatory sequences, is shown to express at high levels upon transfection into CHL cells. Levels of human α2(V)-specific RNA in the cytoplasm of transfected lines of CHL cells were comparable to, and in some cases greater than, levels of α2(V) RNA found in normal human diploid fibroblasts. Although production of endogenous pro-α1(V) homotrimers has been shown previously to represent from 20 to 30% of the total capacity of these cells for protein synthesis (11,12), levels of human pro-α2(V) chains in most clonal lines examined were sufficient to have complexed all endogenous pro-α1(V) chains.
The 2:1 ratio of pepsin-resistant hamster α1(V) chains to human α2(V) chains found in the cell layers of a majority of clonal lines is most consistent with the interpretation that these chains form chimeric hamster/human [α1(V)]2α2(V) heterotrimers. Pepsin resistance indicates that these heterotrimers are in stable triple helical form. Moreover, the heterotrimers are predominantly incorporated into the extracellular matrix of the cell layer. This is characteristic of the close association reported for normal type V collagen and a variety of cultured cell types (13).
FIG. 6. Procollagen species produced by CHL and A1 cell lines. CHL and A1 cultures were metabolically labeled with [3H]proline. Procollagen was isolated from the extracellular matrix (Mat.) associated with the cell layer (see "Materials and Methods") or from culture medium (Med.). Samples reduced with β-mercaptoethanol were analyzed by electrophoresis on a 4.5% SDS-polyacrylamide gel and subsequent autofluorography. Each lane represents collagen isolated from half the total matrix or media harvested from a 6-cm dish of confluent cells.
The 2:1 ratio of α1(V) and α2(V) chains found in pepsinized material from cell layers is also consistent with in vitro renaturation studies (9) and indicates that pro-α2(V) chains do not form pepsin-resistant triple helical pro-α2(V) homotrimers in vivo. Surprisingly, however, media samples were found to contain abundant amounts of pro-α2(V) chains that were not resistant to limited pepsin digestion at 4 °C. This suggests that pro-α2(V) chains can be secreted in a form that does not have a compact triple helical configuration. This is in contrast to numerous studies with type I collagen which suggest that pro-α chains must be in a stable triple helical conformation for productive secretion to occur (26). It remains to be determined whether pepsin-sensitive pro-α2(V) chains in media exist as nonstable homotrimers or as individual chains. In this regard, it is of interest that a line of transformed Syrian hamster cells that do not synthesize pro-α1(I) chains has been shown recently to secrete homotrimers of pro-α2(I) chains (27). These pro-α2(I) homotrimers were relatively unstable and were pepsin sensitive at 15 °C. However, unlike the pro-α2(V) chains reported here, the pro-α2(I) homotrimers were triple helical and stable to pepsin digestion at 4 °C.
The finding that small quantities of triple helical heterotrimers appear in the media (Fig. 5B) whereas intact triple helical pro-α1(V) homotrimers are found exclusively in the cell layer suggests that pro-α1(V) homotrimers may have greater affinity for the cell layer than heterotrimers containing the pro-α2(V) chain. The finding that pepsin-sensitive pro-α2(V) chains are predominantly secreted into media rather than retained in matrix suggests that pro-α2(V) chains, uncomplexed to pro-α1(V) chains, have poor affinity for matrix or that pro-α chains that are not in a stable triple helical configuration are in general poorly incorporated into matrix.
The high levels of expression of pro-α2(V) cDNA achieved here are in contrast to lower levels achieved previously with pro-α2(I) cDNA in the W8 line of rat cells (16). One element contributing to this difference appears to be use of the HCMV-IE promoter/enhancer sequences instead of the Rous sarcoma virus long terminal repeat used in the pro-α2(I) study. In transient expression assays in which pro-α2(I) cDNA is driven by either the Rous sarcoma virus long terminal repeat or HCMV-IE promoter/enhancer in otherwise identical expression vectors, the HCMV-IE sequences yield higher levels of expression.² Recently, it has been reported that the HCMV-IE promoter/regulatory sequences can obviate the need for introns in the efficient expression of immunoglobulin cDNA (28). Preliminary results suggest that in constructs containing either pro-α2(I) or pro-α2(V) cDNA, the downstream SV40 small t splice site is used inefficiently, and only a small percentage of transcripts is spliced.² Therefore, the positive effect of HCMV-IE sequences on expression may represent, to some degree, their ability to compensate for inefficient splicing in these constructs.
The high level functional expression of pro-α2(V) cDNA achieved in CHL cells provides a unique system for study of type V collagen. Site-directed mutagenesis of the cDNA, prior to transfection, may allow mapping of domains important in the biosynthesis and cellular functions of type V collagen. The enzymatic processing of type V procollagen into more mature forms, which remains somewhat obscure (1), might also be addressed in this system, provided that (i) specific procollagen peptidase activities shown to be defective in CHL cells (13) are supplied from culture media of normal fibroblasts; and (ii) the properties of the human/hamster heterotrimer are sufficiently similar to those of naturally occurring type V heterotrimers. Alternatively, transfection of pGGH31 into normal human fibroblasts should yield levels of expression which would be high relative to levels of endogenous type V. This should allow study of the effects of specific mutations of pro-α2(V) collagen in the context of cells that produce normal levels of peptidases and other collagen types. Use of appropriate cis-acting elements and introduction of mutagenized α2(V) clones into transgenic mice may help define physiologic and developmental functions for type V collagen and perhaps reveal a role for aberrant type V collagen in some inherited disorders of connective tissue.