The Effects of Different Cysteine for Glycine Substitutions within α2(I) Chains
EVIDENCE OF DISTINCT STRUCTURAL DOMAINS WITHIN THE TYPE I COLLAGEN TRIPLE HELIX*

Affected individuals from two apparently distinct families with mild osteogenesis imperfecta were heterozygous for a G to T transversion in the COL1A2 gene that resulted in a cysteine for glycine substitution at position 646 in the α2(I) chain of type I collagen. A child with a moderately severe form of osteogenesis imperfecta was heterozygous for a G to T transversion in COL1A2 that resulted in a substitution of cysteine for glycine at position 259 of the α2(I) chain. Type I collagen molecules containing an α2(I) chain with cysteine at position 259 denatured at a lower temperature than molecules containing an α2(I) chain with cysteine at position 646. In contrast to cysteine for glycine substitutions in the α1(I) chain, the severity of the osteogenesis imperfecta phenotype is not directly proportional to the distance of the mutation from the amino-terminal end of the triple helix. These findings could be explained if the type I collagen triple helix contains discontinuous domains that differ in their contributions to maintaining helix stability.
Virtually all forms of the heritable bone disorder osteogenesis imperfecta are due to heterozygosity for mutations in the structural genes for type I collagen (1, 2). Mutations that result only in quantitative decreases in the synthesis of type I procollagen from cultured dermal fibroblasts result in a mild osteogenesis imperfecta (OI) phenotype (3-6) identical to OI type I in the clinical classification of Sillence et al. (7). In contrast, heterozygosity for mutations that result in the synthesis and secretion of structurally abnormal type I procollagen molecules usually causes more severe OI variants (8-10). The great majority of mutations in this latter group appear to result in substitutions for triple helical glycine residues in pro-α1(I) or pro-α2(I) chains (11-22). Type I procollagen molecules containing α1(I) or α2(I) chains with substitutions for triple helical glycine residues have increased post-translational modification (primarily excess hydroxylation and glycosylation of Y-position lysyl residues), usually have lower than normal denaturation temperatures, and are often less efficiently secreted than normal molecules (reviewed in Ref. 1). Within the group of OI patients whose cells make structurally abnormal type I collagen, there is a wide range of clinical severity, from lethal OI to mild osseous fragility. Some substitutions for glycine in α1(I) chains exhibit a "position effect" in that the severity of the OI clinical phenotype appears to be related to the closeness of the mutation to the carboxyl-terminal end of the triple helix (1, 20).
We report biochemical and DNA sequence analysis of two different cysteine for glycine substitutions in the triple helical domain of pro-α2(I) chains in three families with nonlethal variants of osteogenesis imperfecta. The more amino-terminal cysteine substitution results in significantly less stable type I collagen molecules than the other substitution and is associated with a more severe clinical phenotype. These data suggest that disruption of specific regions by glycine substitutions differentially affects the stability of the type I collagen triple helix.

Clinical Description of Affected Individuals
The families have been described more extensively by Cohn and Byers.²
Family A-The proband had a congenital hip dislocation. She fractured a tibia at 2 years of age and the left humerus at 3 and 8 years of age. An affected sister had a hip fracture and, within the next 2 years, two femur fractures. Their mother and maternal grandfather were also affected. OI in all individuals was characterized by mild short stature and moderate fracture frequency with little deformity.
Family B-The proband is the only child of an affected father in a large family with a history of OI. At birth, there was marked bowing of the femurs and tibias, and a skeletal survey showed generalized demineralization. Overall, the severity of OI in this family was comparable to that in family A, but more variable, with some obligate gene carriers having no history of fractures and others having deforming OI. However, even the most severely affected members of this family were less affected than the proband of family C.
Family C-The proband was the first child of parents who were clinically and biochemically normal. Birth weight was low for gestational age; physical examination at birth revealed a wide anterior fontanelle and marked shortness and bowing of the upper and lower limbs. Radiographs showed marked generalized demineralization. Two fractures were noted in the neonatal period.

Cell Culture
Cultured dermal fibroblasts were maintained in Dulbecco's modified Eagle's medium supplemented with 20% calf serum (Gibco Laboratories) in humidified 5% CO2 at 37 °C.
Pulse Labeling of Cells with [3H]Proline or [35S]Cysteine and Harvesting of Labeled Procollagens
Cells (1 × 10⁶) were plated in 100-mm dishes and allowed to attach and spread overnight. Cells were preincubated for 4 h in RPMI 1640 Select-Amine (Gibco Laboratories) lacking either proline and hydroxyproline (for proline labeling) or cysteine (for cysteine labeling) and supplemented with 50 µg/ml ascorbate. After 4 h, the medium was removed and replaced with fresh medium and either 100 µCi of 2,3,4,5-[3H]proline (108 Ci/mmol) or 200 µCi of [35S]cysteine (>600 Ci/mmol) (Amersham Corp.) for 16 h. Procollagens were harvested from fibroblast medium into inhibitors as previously described (12) and then dialyzed three times in 2 liters of 0.5 N acetic acid prior to lyophilization.
Pepsin Digestion of Procollagens
To produce collagen-sized molecules, procollagens were digested in 1.0 ml of 50 µg/ml pepsin and 0.5 N acetic acid at 4 °C for 16 h. Pepstatin was then added, and samples were dialyzed three times for 5 h in 2 liters of 0.5 N acetic acid at 4 °C prior to lyophilization.

Melting Point Experiments
Thermal denaturation temperatures were determined using the following adaptation of the method of Bruckner and Prockop (23).
[3H]Proline- or [35S]cysteine-labeled collagens synthesized by cells from affected members of each family were dissolved in 400 mM NaCl and 100 mM Tris (pH 7.4), and 45 µl was added to 0.5-ml microcentrifuge tubes. One tube of each sample was incubated at room temperature, and the other tubes were placed in a Perkin-Elmer Cetus thermal cycler set initially at 30 °C and programmed to increase 1 °C every 12 min. One tube each of [3H]proline- and [35S]cysteine-labeled collagens was removed at the end of each 10-min plateau at 36-42 °C; the sample was cooled at 20 °C for 30 s and then digested with trypsin and chymotrypsin at final concentrations of 100 and 250 µg/ml, respectively, for 2 min. Digestion was stopped by addition of SDS and phenylmethylsulfonyl fluoride followed by boiling. Digestion products were separated by gel electrophoresis on 5% SDS-polyacrylamide gels (24) and were detected by autoradiofluorography (25) using EN3HANCE (Du Pont-New England Nuclear) as the fluor. Quantitation of α chains of type I collagen was performed by scanning gel densitometry using a Pharmacia LKB Biotechnology gel scanner, and peaks were quantitated using LIPS software (Spectrofuge Corp., Durham, NC).
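The timing of the stepwise temperature program can be sketched as follows. This is only an illustration: the text gives the 30 °C starting point, the 1 °C step every 12 min, and sampling at the end of each plateau from 36 to 42 °C, but the assumption that the 30 °C step begins at time zero, and the function names, are hypothetical.

```python
# Illustrative timing of the stepwise temperature program.
START_TEMP_C = 30   # initial thermal cycler setting (from the text)
STEP_MIN = 12       # +1 °C every 12 min (from the text)


def step_end_minutes(temp_c: int) -> int:
    """Minutes from the start of the run to the end of the step held at temp_c.

    Assumes (hypothetically) that the 30 °C step starts at time zero and that
    each 12-min step consists of a brief ramp followed by its plateau.
    """
    if temp_c < START_TEMP_C:
        raise ValueError("temperature below the starting set point")
    return (temp_c - START_TEMP_C + 1) * STEP_MIN


def ramp_rate_c_per_hour() -> float:
    """Average heating rate implied by one 1 °C step every STEP_MIN minutes."""
    return 60 / STEP_MIN


if __name__ == "__main__":
    for temp in range(36, 43):
        print(f"{temp} °C tube removed at about {step_end_minutes(temp)} min")
    print(f"average rate: {ramp_rate_c_per_hour():.1f} °C/h")
```

Under these assumptions the seven sampling points span a little over an hour, and the average heating rate works out to 5 °C per hour.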
Synthesis of Double-stranded cDNA
Pro-α2(I) cDNA was synthesized as previously described (26). Total RNA was isolated by the method of Chomczynski and Sacchi (27) from cultured skin fibroblasts which had been preincubated for 72 h in medium supplemented with 100 µM ascorbic acid, replaced daily. Poly(A) RNA was isolated from total RNA by oligo(dT) chromatography. Double-stranded cDNA was synthesized by the method of Gubler and Hoffman (28), except that a pro-α2(I)-specific primer (5'-GACTCCAGGACTACCCACAG-3'), complementary to residues 2867-2886 (29) of the pro-α2(I) collagen mRNA, was used for first-strand synthesis.
DNA Cloning and DNA Sequence Analysis
cDNAs were amplified by the polymerase chain reaction (PCR) (30). For families A and B, peptide mapping had localized the abnormal cysteine between triple helical residues 358 and 776, and cDNAs were amplified using two oligonucleotides with the sequences 5'-CGAGGACCTAATGGAGATGC-3', corresponding to nucleotides 1429-1448 (residues 342-349 in the triple helix), and 5'-GCACCAGCAACACCAGGTAG-3', complementary to nucleotides 2804-2823 (residues 800-806 in the triple helix) (29). For family C, the abnormal cysteine had been mapped to α chain residues 6-327 in the triple helix, and cDNAs were amplified using oligonucleotides with the sequences 5'-GTAACCTTATGCCTAGCAAC-3', corresponding to nucleotides 178-197 (residues 15-21 in the amino-terminal propeptide), and 5'-TTCCAGCGGGGCCGATATTT-3', complementary to nucleotides 1506-1525 (residues 368-374 in the triple helix) (29). PCR primers used to amplify cDNAs from families B and C contained a 10-base sequence at their 5'-ends that provided EcoRI cloning sites after amplification. Amplified cDNAs were cloned into pUC18 vectors (Bethesda Research Laboratories). After alkaline denaturation of double-stranded clones (30), cDNA sequence analysis was performed by an adaptation of the method of Sanger et al. (32) using a kit (Sequenase, U. S. Biochemical Corp.) according to the manufacturer's instructions.
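As a quick consistency check on the primer descriptions above, the following sketch compares each primer's length with the nucleotide span it is said to cover and estimates the size of each amplified fragment. The primer sequences and coordinates are taken from the text; the dictionary keys and function names are illustrative, and the estimate ignores the 10-base EcoRI extensions on the family B and C primers.

```python
# Primer sequences and mRNA coordinates as quoted in the text.
PRIMERS = {
    "A/B sense":     ("CGAGGACCTAATGGAGATGC", 1429, 1448),
    "A/B antisense": ("GCACCAGCAACACCAGGTAG", 2804, 2823),
    "C sense":       ("GTAACCTTATGCCTAGCAAC", 178, 197),
    "C antisense":   ("TTCCAGCGGGGCCGATATTT", 1506, 1525),
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def span_length(start: int, end: int) -> int:
    """Number of nucleotides in an inclusive coordinate range."""
    return end - start + 1


def amplicon_length(sense_key: str, antisense_key: str) -> int:
    """Fragment size from the first sense base to the last antisense base."""
    _, sense_start, _ = PRIMERS[sense_key]
    _, _, antisense_end = PRIMERS[antisense_key]
    return span_length(sense_start, antisense_end)


if __name__ == "__main__":
    for name, (seq, start, end) in PRIMERS.items():
        ok = len(seq) == span_length(start, end)
        print(f"{name}: {len(seq)} nt, span {start}-{end}, consistent={ok}")
    print("families A/B amplicon:", amplicon_length("A/B sense", "A/B antisense"), "bp")
    print("family C amplicon:", amplicon_length("C sense", "C antisense"), "bp")
```

Each primer is 20 nucleotides long, matching its quoted span, and both estimated fragments are roughly 1.3 to 1.4 kilobases.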

RESULTS
Cohn and Byers² found that pro-α2(I) chains synthesized by cells from affected members of families A-C contain a cysteine residue within the triple helical domain, a region from which cysteine is normally excluded. Analysis of cyanogen bromide peptides of α2(I) determined that for families A and B, the cysteine resided between helical residues 357 and 775 and that for family C, the cysteine was located between residues 6 and 323 (Fig. 1A).
Pro-α2(I) cDNAs from RNA harvested from an affected member of family A were synthesized using gene-specific primers and were amplified by PCR (Fig. 1, B and C). PCR yielded a single fragment with the expected size of 1.3 kilobases (data not shown). PCR products were cloned into the SmaI site of pUC18 and were sequenced in their entirety. Two of seven clones analyzed contained a T for G substitution at nucleotide 2341 that resulted in a cysteine rather than the normal glycine codon for triple helical residue 646 (Fig. 2, upper). In addition, all seven clones differed from the published pro-α2(I) sequence at nucleotide 1583, resulting in an alanine (GCT) rather than a valine (GTT) codon at triple helical residue 420.
Pro-α2(I) cDNAs synthesized from mRNA harvested from cultured fibroblasts from an affected member of family B were amplified by the polymerase chain reaction, yielding a single product ~1.3 kilobases in length (data not shown). PCR products were cloned into the EcoRI site of pUC18 using EcoRI sites contained in the PCR primers, and the clones were sequenced in their entirety. Two of four clones contained a T for G transversion at nucleotide 2341 that resulted in a cysteine for glycine substitution at triple helical residue 646 (data not shown). All four clones also differed from the published sequence at nucleotide 1583 as described above.
Pro-α2(I) cDNAs from RNA harvested from cells from the proband of family C were amplified by PCR (Fig. 1D), and the products were cloned. Three of seven clones contained a T for G substitution at nucleotide 1180 that resulted in a cysteine for glycine substitution at the codon for triple helical residue 259 (Fig. 2, lower).
To determine whether the positions of the two cysteine for glycine substitutions had different effects on the stability of the triple helix, thermal denaturation temperatures of the type I collagen triple helix were assayed by digestion with trypsin and chymotrypsin (Fig. 3). To compare the thermal denaturation of [3H]proline- and [35S]cysteine-labeled proteins, the midpoint temperature (Tm) for the helix to coil transition was defined by the resistance of α2(I) chains in type I collagen molecules to protease digestion. For type I collagen molecules containing α2(I) chains with cysteine at position 646, the Tm was 39.5 °C; and for molecules containing α2(I) chains with cysteine at position 259, it was 38 °C. The Tm for type I collagen molecules from control cells labeled with [3H]proline was 40 °C.
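The two reported mutations can be checked against the standard genetic code and against each other. The sketch below enumerates which single-base changes convert a glycine codon into a cysteine codon, and tests whether the two reported nucleotide and residue positions are spaced three nucleotides per residue. The function names are illustrative. The assumption that glycine occupies every third residue starting at residue 1 follows from the Gly-X-Y repeat but is not stated explicitly in the text.

```python
GLYCINE_CODONS = {"GGT", "GGC", "GGA", "GGG"}
CYSTEINE_CODONS = {"TGT", "TGC"}

# (nucleotide, triple helical residue) pairs reported for the two mutations.
REPORTED = [(1180, 259), (2341, 646)]


def gly_to_cys_changes():
    """List single-base changes that turn a glycine codon into a cysteine codon."""
    changes = []
    for codon in sorted(GLYCINE_CODONS):
        for pos in range(3):
            for base in "ACGT":
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if mutant in CYSTEINE_CODONS:
                    # (original codon, codon position 1-3, old base, new base, mutant codon)
                    changes.append((codon, pos + 1, codon[pos], base, mutant))
    return changes


def spacing_is_consistent(pairs) -> bool:
    """True if the nucleotide gap equals three times the residue gap."""
    (nt_a, res_a), (nt_b, res_b) = pairs
    return nt_b - nt_a == 3 * (res_b - res_a)


def in_glycine_register(residue: int) -> bool:
    """True if the residue falls on the Gly position, assuming Gly at 1, 4, 7, ..."""
    return residue % 3 == 1


if __name__ == "__main__":
    for change in gly_to_cys_changes():
        print(change)
    print("spacing consistent:", spacing_is_consistent(REPORTED))
    print("both in Gly register:", all(in_glycine_register(r) for _, r in REPORTED))
```

Only a G to T change at the first codon position (GGT to TGT, or GGC to TGC) produces cysteine from glycine, which agrees with the substitutions found at nucleotides 1180 and 2341.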
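One way to read a midpoint temperature off densitometry data is to interpolate linearly between the two sampled temperatures that bracket 50% protease resistance. The sketch below illustrates that idea only. The interpolation rule, the function name, and the example fractions are assumptions for illustration and are not the authors' data or their stated procedure.

```python
def midpoint_temperature(temps_c, fraction_resistant):
    """Estimate the temperature at which half of the chains remain protease resistant.

    temps_c: sampled temperatures in ascending order.
    fraction_resistant: band intensity at each temperature divided by the
        intensity of the room-temperature control (values between 0 and 1).
    Uses linear interpolation between the two points that bracket 0.5.
    """
    points = list(zip(temps_c, fraction_resistant))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= 0.5 >= f1 and f0 != f1:
            return t0 + (f0 - 0.5) * (t1 - t0) / (f0 - f1)
    raise ValueError("resistant fraction never crosses 0.5 in the sampled range")


if __name__ == "__main__":
    # Hypothetical values, chosen only to illustrate the calculation.
    temps = [36, 37, 38, 39, 40, 41, 42]
    fractions = [1.0, 0.95, 0.8, 0.6, 0.4, 0.1, 0.0]
    print(f"estimated Tm: {midpoint_temperature(temps, fractions):.1f} °C")
```

Because samples are taken at 1 °C intervals, an estimate of this kind cannot resolve differences much finer than the 0.5 °C steps reported.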

DISCUSSION
Cultured dermal fibroblasts from affected members of three families with nonlethal OI synthesized normal and abnormal populations of type I collagen molecules due to heterozygosity for single nucleotide substitutions in COL1A2 that resulted in cysteine for glycine substitutions at different locations in the type I collagen triple helix. In two families, a mild OI phenotype with osseous fragility, mild long bone deformity, and mild short stature was associated with heterozygosity for a cysteine for glycine substitution at position 646 in the triple helix; in a third family, a single proband with severe deformity of long bones and dwarfing had a new dominant mutation that resulted in substitution of cysteine for glycine at position 259. Type I collagen molecules incorporating pro-α2(I) chains with cysteine at position 646 melted 0.5 °C below control type I collagen, and those incorporating pro-α2(I) chains with cysteine at position 259 melted 2.0 °C below control.
Tm values for type I collagen vary slightly from laboratory to laboratory because of differences in experimental procedures, and Tm values of control and OI collagens should only be interpreted as relative values. To improve reproducibility, we programmed a thermal cycler to elevate incubation temperature by 1 °C every 12 min; although this produces the same overall temperature elevation rate as was described by Bruckner and Prockop (23), the shape of the temperature curve more closely approximates a series of plateaus than the more continuous elevation generated by a circulating water bath, the method used in previous melting point experiments (12) on type I collagens from OI cell strains. A more important difference is that for the purpose of comparison between proline- and cysteine-labeled molecules, Tm curves in this report only reflect protection of α2(I) chains incorporated into type I collagen molecules. Virtually all melting point studies based on the method of Bruckner and Prockop (23) show loss of protection of α2(I) chains at a lower temperature than that of α1(I) chains, as does our analysis.
Since the type I collagen triple helix winds from its carboxyl- to amino-terminal end, molecules incorporating pro-α chains with substitutions for glycine in the triple helical domain probably assemble normally carboxyl-terminal to the substitution. This model is supported by the consistent finding in many OI cell strains that excess post-translational modifications of abnormal type I collagen molecules occur amino-terminal to the substitution (reviewed in Ref. 1). Either propagation of the triple helix is slowed at the substitution or else the structure of the triple helix amino-terminal to the substitution is abnormal in a way that permits continued post-translational modification. Thus, mutations closest to the carboxyl terminus would alter a greater portion of the triple helix.
Starman et al. (20) proposed that in individuals heterozygous for cysteine for glycine substitutions in the α1(I) chain, severity of the OI clinical phenotype correlates with the position of the substitution along the chain, with the mildest phenotypes resulting from mutations near the amino-terminal end of the triple helix. Arginine for glycine substitutions in α1(I) have a similar position effect, except that the lethal phenotype is seen with more amino-terminal mutations (391 versus 691 for cysteine) (12, 14, 22). However, the more severe OI phenotype associated with cysteine at position 259 in α2(I) chains compared to that associated with cysteine at position 646 suggests that for substitutions in α2(I) chains, the position effect may differ from that in α1(I) or else it may be overridden by disruption of important domains in the triple helix.
Substitutions for glycine residues in α chains of type I collagen can serve as probes for determinants of triple helical stability when the resulting decrease in Tm (the temperature of helix to coil transition in type I collagen molecules containing one or more mutant chains) is examined.
All but two substitutions for triple helical glycine residues in OI cell strains have resulted in decreased thermal stability of type I collagen molecules. One exception was a substitution of arginine for glycine at position 1012 in the α2(I) chain, the carboxyl-terminal glycine of the triple helix (15). The other was a substitution of serine for glycine at position 844 of the α1(I) chain; the limited effect of the latter substitution on thermal stability may be because a serine substituted for glycine results in only minor alterations in the structure of the helix (21).
FIG. 4. ... in the type I collagen triple helix. There are 1014 residues/chain in the triple helix; reference numbering of residues in each chain begins at the amino-terminal end of the triple helix. During molecular assembly, the triple helix winds from the carboxyl- to the amino-terminal end. Lower, residues immediately adjacent and amino-terminal to cysteine substitutions (asterisks) at positions 646 (families A and B) and 259 (family C) in contiguous α1(I) and α2(I) chains of a type I collagen molecule (28, 35). Amino-terminal to position 646 but not position 259 is an abundance of G-P-P and G-P-A triplets (boxed residues) that may provide stability to triple helical structure during assembly of a type I collagen molecule incorporating an α2(I) chain with a cysteine for glycine substitution (33, 34).
This study supports a model of the type I collagen triple helix in which some regions contribute to triple helical stability more than others. Increasing proline content confers stability to the collagen triple helix. The triple helix is further stabilized by hydroxylation of prolyl residues in the Y-position (reviewed in Ref. 33). Proline or hydroxyproline occupies 20 of the 63 residues (seven triplets/chain) in the triple helix immediately amino-terminal or distal to the cysteine for glycine substitution at position 646 (29, 36); 9 of these 20 prolyl residues are in the Y-position and are probably hydroxylated (Fig. 4). In addition, 10 of the 21 Gly-X-Y triplets in this region are either Gly-Pro-Hyp or Gly-Pro-Ala, both of which have been shown to produce the most stable helical structures (34, 35). The presence of these stabilizing features may limit the disruption of the helix that would be expected amino-terminal to the glycine substitution at position 646. In contrast, there are 16 prolines in the 63 residues amino-terminal to position 259, of which only 5 are in the Y-position; only 1 of those 21 Gly-X-Y triplets is Gly-Pro-Hyp, and none are Gly-Pro-Ala.
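Counts of this kind (imino acid content, Y-position prolines, and stabilizing triplets) can be tallied mechanically from a sequence window written in Gly-X-Y register. The sketch below shows one way to do so. The function name, the use of "O" for hydroxyproline, and the example sequence are hypothetical and are not the actual α1(I) or α2(I) sequences.

```python
def triplet_stats(window: str) -> dict:
    """Tally stabilizing features in a window of Gly-X-Y triplets.

    window: one-letter amino acid codes, starting at a glycine, with a length
        that is a multiple of 3. 'P' is proline; 'O' denotes hydroxyproline.
    """
    if len(window) % 3 != 0:
        raise ValueError("window length must be a multiple of 3")
    triplets = [window[i:i + 3] for i in range(0, len(window), 3)]
    if any(t[0] != "G" for t in triplets):
        raise ValueError("window is not in Gly-X-Y register")
    return {
        "residues": len(window),
        "triplets": len(triplets),
        # Proline or hydroxyproline anywhere in the window.
        "imino_residues": sum(window.count(code) for code in "PO"),
        # Proline or hydroxyproline in the Y-position of a triplet.
        "y_position_imino": sum(t[2] in "PO" for t in triplets),
        # Gly-Pro-Hyp (written GPO or, unmodified, GPP) and Gly-Pro-Ala.
        "stabilizing_triplets": sum(t in ("GPO", "GPP", "GPA") for t in triplets),
    }


if __name__ == "__main__":
    # Hypothetical five-triplet window, for illustration only.
    print(triplet_stats("GPOGPAGERGAOGPQ"))
```

To reproduce a 63-residue tally such as the one quoted in the text, the function would be applied to the seven-triplet window from each of the three chains and the results summed.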
The effects on thermal stability of substitutions of cysteine for glycine in the α2(I) chain differ from those in the α1(I) chain. As a general rule, as substitutions in the α1(I) chain move from the carboxyl-terminal end toward the amino terminus, the thermal stability of molecules that contain one or two abnormal chains increases toward normal. Molecules that contain two abnormal chains typically have higher denaturation temperatures than those with one, presumably because the interchain, intramolecular disulfide bond serves to stabilize the molecule. In contrast, there is no similar gradient for substitutions of cysteine for glycine in the α2(I) chain. Substitutions in the α2(I) chain result in two classes of molecules, those that contain the abnormal chain and those that are normal; thus, there is no opportunity to modify the effect of a mutation by stabilizing molecules through interchain disulfide bonds. As a result, mutations in the COL1A2 gene may provide clearer insight than those in the COL1A1 gene into the domain structure of the triple helix and identify those regions which can be destabilized by point mutations.
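The difference in the number of molecular classes follows from the chain composition of type I collagen, which has two α1(I) chains and one α2(I) chain per molecule. The sketch below works through the expected proportions in a heterozygote. It assumes, hypothetically, that both alleles contribute equal amounts of chain and that normal and mutant chains associate at random; the function name is illustrative.

```python
from fractions import Fraction
from math import comb


def mutant_chain_distribution(chains_per_molecule: int,
                              mutant_fraction: Fraction = Fraction(1, 2)) -> dict:
    """Expected fraction of molecules carrying 0, 1, ... mutant chains.

    chains_per_molecule: copies of the affected chain type in one molecule
        (2 for alpha1(I), 1 for alpha2(I) in type I collagen).
    mutant_fraction: share of that chain type that is mutant; 1/2 assumes
        equal output from the normal and mutant alleles.
    Assumes random (binomial) incorporation of chains into molecules.
    """
    n, p = chains_per_molecule, mutant_fraction
    return {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}


if __name__ == "__main__":
    print("alpha1(I) mutation:", mutant_chain_distribution(2))
    print("alpha2(I) mutation:", mutant_chain_distribution(1))
```

Under these assumptions an α1(I) mutation yields three classes of molecules (one quarter normal, one half with one mutant chain, one quarter with two), whereas an α2(I) mutation yields only two classes in equal proportion, so no molecule can contain a pair of mutant cysteines.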