Calcium binding, hydroxylation, and glycosylation of the precursor epidermal growth factor-like domains of fibrillin-1, the Marfan gene protein.

The extracellular matrix protein fibrillin-1 is a major component of elastic microfibrils, which are complex assemblies of several proteins and are found in most connective tissues, frequently associated with elastin. Fibrillin-1 contains 43 precursor epidermal growth factor-like (pEGF) domains that have a consensus sequence for calcium binding. The calcium binding potential of a fibrillin-1 pepsin fragment (PF2) was quantitatively analyzed using microvolume equilibrium dialysis. Peptide sequence data and pepsin fragment size determination indicate that PF2 contains seven pEGF domains, each with the calcium binding consensus sequence. Scatchard plot analysis of the calcium binding data shows that PF2 has six to seven high affinity binding sites with a Kd = 250 microM at pH 7.5. There is a second overlapping consensus sequence in the pEGF domains for beta-hydroxylation of a specific Asp/Asn residue. Five partially hydroxylated Asn residues have been identified by protein sequence analysis of fibrillin-1 fragments. This is the first demonstration of this modification in a connective tissue protein. The calcium binding consensus sequence also contains a conserved Ser residue with an apparently novel modification, which causes the Ser residue to behave like an Asp residue during protein sequencing. Marfan syndrome, a heritable disorder of connective tissue, is known to be associated with mutations in the FBN1 gene. Most of these mutations have been found in pEGF domains, frequently substituting Cys for another amino acid, destroying the pEGF motif secondary structure along with its calcium binding potential. Other mutations cause the substitution of single amino acids in the calcium binding consensus sequence, which could affect calcium binding but also the hydroxylation of Asp/Asn residues or the modification of Ser residues.

Fibrillin-1 is a 350-kDa, single-chain, non-collagenous, extracellular matrix glycoprotein (1). In tissues it is a major component of elastic microfibrils, which are insoluble macromolecular structures made up of a complex assembly of several proteins (2-6). Elastic microfibrils are frequently associated with amorphous cores of elastin but are also found in bundles independently of elastin, for example in ciliary zonules (7) and skin (8). The functions of the microfibrils are unclear but include being a network onto which amorphous elastin is deposited during development (9).
Mutations in the fibrillin-1 gene give rise to the dominantly inherited connective tissue disease known as Marfan syndrome. Fibrillin-1 cDNA has been cloned and sequenced, and a number of mutations causing Marfan syndrome have been identified (10). The derived amino acid sequence of fibrillin-1 has a complex domain structure revealed by its similarities with other proteins (11-14). A major portion of the protein structure is composed of tandemly repeated domains, which constitute approximately 65% of the sequence of the molecule and are identified by their likeness to repeats found in precursor epidermal growth factor (pEGF). These are sequences of about 40 amino acids that contain 6 cysteines, which fold into a characteristic antiparallel β sheet structure (15,16). They are also found in a variety of functionally unrelated proteins in organisms ranging from Drosophila and Caenorhabditis elegans to humans. The pEGF domains of many proteins such as blood coagulation factors VII, IX, and X, protein S, and protein C also contain consensus sequences for calcium binding and asparagine hydroxylation (21). Of the 46 pEGF domains in fibrillin-1, 43 contain predicted calcium binding and β-hydroxylation sites (13,14). The binding of calcium to a Western blot of fibrillin also suggested that fibrillin-1 pEGF domains may bind calcium (14). Several mutations have been reported that disrupt the disulfide bonding of specific pEGF domains and presumably their ability to bind calcium (22-26). Other mutations that alter single amino acids in the calcium binding consensus sequence could also affect calcium binding (25, 27-29). Two unusual O-glycosidic modifications of Ser/Thr residues are also found in EGF motifs of multidomain proteins. The serine residue in the consensus sequence -Cys-Xaa-Ser-Xaa-Pro-Cys- is modified with (Xylα1-3)Xylα1-3Glcβ1- and in -Cys-Xaa-Xaa-Gly-Gly-Thr/Ser-Cys- with Fucα1- in several proteins (30). In factor IX both of these glycosylated Ser residues and hydroxyasparagine are present within a 12-amino acid sequence (31). The functions of these modifications are unknown, and no relationship between them has been discovered.
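The two O-glycosylation consensus sequences quoted above translate directly into one-letter regular expressions. The following is a minimal sketch for illustration only and is not part of the original study: the function name `find_glycosylation_motifs`, the motif labels, and the example peptide are hypothetical, and Xaa is assumed to stand for any single residue.

```python
import re

# One-letter regular expressions for the two consensus sequences in the text.
# Assumption: "Xaa" matches any single residue.
MOTIFS = {
    "xylose-glucose on Ser": re.compile(r"C.S.PC"),   # Cys-Xaa-Ser-Xaa-Pro-Cys
    "fucose on Thr/Ser": re.compile(r"C..GG[TS]C"),   # Cys-Xaa-Xaa-Gly-Gly-Thr/Ser-Cys
}


def find_glycosylation_motifs(sequence):
    """Return (motif label, 1-based start position, matched text) for every hit."""
    hits = []
    for label, pattern in MOTIFS.items():
        for match in pattern.finditer(sequence.upper()):
            hits.append((label, match.start() + 1, match.group()))
    # Report hits in order of their position along the sequence.
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical peptide constructed only to contain one copy of each motif.
example = "AACQSLPCKKCAAGGTCDD"
for hit in find_glycosylation_motifs(example):
    print(hit)
```

A match shows only that the sequence context is present; whether a given serine or threonine is actually modified has to be established experimentally.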
Overall, the structural and functional roles of fibrillin-1 have remained rather poorly defined, mostly because it is not possible to isolate native protein in useful amounts. However, large pepsin fragments have been isolated and well characterized, providing a resource for investigating putative properties (32). Data presented here quantitate the calcium binding properties of fibrillin-1 pepsin fragment PF2. Partial hydroxylation of a conserved asparagine in some pEGF domains is demonstrated, and evidence for glycosylation of a conserved serine residue in a new consensus sequence is presented.
The abbreviations used are: pEGF, precursor epidermal growth factor; EGF, epidermal growth factor; PTH, phenylthiohydantoin; 8CYS, domains containing 8 cysteines.

MATERIALS AND METHODS
Preparation of the Fibrillin-1 Fragment PF2-Human amnion was solubilized using pepsin, and the resulting pepsin fragments PF1, PF2, and PF3 were isolated and purified as described earlier (32). The purified PF2 fragment migrated as a single band in SDS-polyacrylamide gel electrophoresis in both the reduced and unreduced states and had an apparent molecular weight of approximately 50,000. Amino acid sequence analysis determined its location within the complete sequence of fibrillin-1 (11) and showed that it is composed of seven tandem, complete pEGF domains with parts of 8CYS domains at its amino and carboxyl termini (Fig. 1).
Equilibrium Dialysis-Calcium binding studies were carried out using an eight-chamber microvolume equilibrium dialysis instrument (Hoefer Scientific Instruments). The dialysis was performed using a sample volume of 100 μl and a dialysis membrane (EMD 103) with a Mr cutoff of 12,000-14,000. All calcium-free buffers were treated with Chelex 100 (Bio-Rad) to remove any traces of heavy metal ions. The dialysis membrane was prepared according to the manufacturer's instructions using Chelex-treated solutions. A stock sample solution was prepared by dissolving fibrillin-1 fragment PF2 in Tris-HCl, pH 7.5, 0.17 M NaCl at a concentration of 4 mg/ml. A 1:4 dilution of the sample was made with sample buffer, and 100 μl of this was placed on one side of the dialysis membrane. Various concentrations of cold calcium (from 0 to 2000 μM) in 100 μl of buffer containing 50 nCi of 45Ca were placed on the other side of the membrane. Equilibrium was attained after 16 h at room temperature. The concentration of calcium on both sides of the membrane was calculated from the cpm of 45Ca measured in 60-μl aliquots taken from each side of the membrane. The aliquots were added to 10 ml of ScintiVerse (Fisher Scientific) and counted for 15 min in a scintillation counter (Beckman LS 6000SE) in the 14C window.
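The conversion from 45Ca counts to free and bound calcium is not spelled out in the text. The sketch below shows one plausible way to do it, under stated assumptions: both chambers have equal volume, equal-volume aliquots are counted, the tracer contributes negligible calcium mass, the protein chamber starts calcium-free, and losses to the membrane are ignored. The function name `dialysis_binding` and the example numbers (including the 20 μM protein concentration, which corresponds to 1 mg/ml of a 50,000 molecular weight fragment) are hypothetical.

```python
def dialysis_binding(cpm_buffer_side, cpm_protein_side, total_calcium_uM, protein_uM):
    """Estimate free calcium (uM) and moles of bound calcium per mole of protein.

    cpm_buffer_side, cpm_protein_side: counts from equal-volume aliquots.
    total_calcium_uM: calcium concentration initially loaded in the buffer chamber.
    protein_uM: protein concentration in the sample chamber.
    """
    total_cpm = cpm_buffer_side + cpm_protein_side
    # With equal chamber volumes, the loaded calcium redistributes in
    # proportion to the counts recovered on each side of the membrane.
    free_uM = total_calcium_uM * cpm_buffer_side / total_cpm
    protein_side_uM = total_calcium_uM * cpm_protein_side / total_cpm
    # The protein chamber holds free plus bound calcium; the excess over the
    # buffer chamber is taken as bound.
    bound_uM = protein_side_uM - free_uM
    return free_uM, bound_uM / protein_uM


# Hypothetical counts, chosen only to illustrate the arithmetic.
free, bound_per_protein = dialysis_binding(4000, 6000, 500.0, 20.0)
print(free, bound_per_protein)
```

Each calcium concentration in the titration yields one such pair of values, which together make up the binding curve.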
Competition Experiments with Magnesium and Zinc-A cold calcium concentration of 300 μM was used, and chloride salts of either magnesium or zinc were added to a final concentration of 3 mM. 50 nCi of 45Ca were added, and the dialysis was carried out as described above.
Amino Acid Analysis-The Pico Tag system (Waters) was used with minor modifications (33).
Sequencing-Peptide sequences were determined using a gas-phase sequencer (Applied Biosystems, model 470A) with an on-line PTH-derivative analyzer (Applied Biosystems, model 120). The procedures used were those described in the manufacturer's manuals. The cDNA sequences were taken from Maslen et al. (11).
Data Analysis-Calcium binding data were analyzed by a Poisson-weighted first-degree regression fit of moles of bound calcium/mole of protein versus nanomoles of free calcium. From the regression coefficients, the number of binding sites was calculated to be 6.3. An estimated uncertainty of 0.86 in this value was found by propagating the coefficient standard errors determined from the regression analysis.
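The regression equations are not given in the text. As a hedged illustration, the sketch below assumes the standard Scatchard linearization for n equivalent, independent sites, r/[Ca] = n/Kd - r/Kd (r is moles of calcium bound per mole of protein), fits it by weighted least squares, and propagates the coefficient errors to first order. The function names, the caller-supplied standard deviations `sigma` (for example, derived from counting statistics), and the synthetic data are all assumptions, not the authors' procedure.

```python
import math


def weighted_line_fit(x, y, sigma):
    """Weighted least-squares line y = a + b*x using weights 1/sigma**2.

    Returns a, b, var(a), var(b), cov(a, b).
    """
    w = [1.0 / s ** 2 for s in sigma]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = sw * sxx - sx ** 2
    a = (sxx * sy - sx * sxy) / delta
    b = (sw * sxy - sx * sy) / delta
    return a, b, sxx / delta, sw / delta, -sx / delta


def scatchard_sites(free_uM, bound_per_protein, sigma):
    """Fit r/[Ca] = n/Kd - r/Kd; return n, its standard error, and Kd (uM)."""
    x = list(bound_per_protein)
    y = [r / f for r, f in zip(bound_per_protein, free_uM)]
    a, b, var_a, var_b, cov_ab = weighted_line_fit(x, y, sigma)
    n = -a / b            # x-intercept of the Scatchard line
    kd = -1.0 / b         # slope is -1/Kd
    # First-order error propagation for n = -a/b, including the covariance.
    dn_da = -1.0 / b
    dn_db = a / b ** 2
    var_n = dn_da ** 2 * var_a + dn_db ** 2 * var_b + 2.0 * dn_da * dn_db * cov_ab
    return n, math.sqrt(max(var_n, 0.0)), kd


# Synthetic, noise-free data generated for n = 7 sites and Kd = 250 uM.
free = [50.0, 100.0, 250.0, 500.0, 1000.0, 2000.0]
bound = [7.0 * f / (250.0 + f) for f in free]
sigma = [0.05 * r / f for r, f in zip(bound, free)]  # hypothetical 5% errors
print(scatchard_sites(free, bound, sigma))
```

Because r appears on both axes of a Scatchard plot, the errors in x and y are correlated, so the propagated uncertainty from a simple line fit is only approximate.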

RESULTS
The pepsin fragment PF2 was characterized as described previously. Amino acid sequencing of its amino terminus demonstrated its purity, and sequencing of tryptic and V8 protease digests defined its length (32). Although the exact carboxyl-terminal sequence was not defined, it was localized from the sequence data to an 8CYS domain (11). This was sufficient to show that the peptide has seven intact pEGF-like domains, each of which contains a consensus sequence for calcium binding. The position of the fragment within the complete fibrillin-1 structure is shown in Fig. 1.

A calcium binding curve for PF2, plotted from the equilibrium dialysis experiments, is shown in Fig. 2A. Binding increased steadily with increasing calcium concentration, giving no indication of widely differing affinities of the many pEGF-like domains for calcium. The Scatchard plot (Fig. 2B) indicated the presence of seven binding sites in PF2 with an average dissociation constant of 250 μM. A 10-fold molar excess of magnesium and zinc ions over calcium ions did not decrease calcium binding.
Comparing the protein sequence of PF2 with the cDNA sequence revealed several serine residues, conserved in each of the pEGF-like domains, that were consistently misidentified as aspartic acid in the protein sequences. Fig. 3 shows the PTH-derivative analysis from seven steps of the sequence around one such serine residue from pEGF29 (Fig. 4). The position of a true serine is indicated in step 20, and the observed peak, which eluted in the position of aspartic acid, is labeled (D). The serine must therefore be post-translationally modified in some way such that the elution position of its PTH-derivative is shifted to that of aspartic acid.
Protein sequencing also revealed hydroxylation of specific asparagine residues in some but not all pEGF-like domains. Step 16 in Fig. 3 shows a PTH-derivative analysis from the sequencing of a V8 protease peptide in which asparagine is hydroxylated. The erythro-hydroxyasparagine derivative elutes just before asparagine as described previously (34). The extent of hydroxylation of a given residue was approximately 50%, but not all asparagine residues in this position of each pEGF-like domain were hydroxylated. Fig. 4 summarizes the post-translational modification status of the Asp/Asn and Ser residues in PF2.

DISCUSSION
Pepsin fragment PF2 is an important resource for investigating the structural and functional properties of fibrillin-1. The use of a fragment complete with the complicated array of post-translational modifications found in fibrillin-1 ensures that the studies reflect the true nature of the native protein.
Eventual comparison of recombinant PF2 with extracted PF2 will provide insight as to the function of the post-translational modifications. In addition, the study of tandemly linked pEGF domains provides a more accurate representation of their physiological conformation and, therefore, affinity for calcium than does the study of an isolated individual domain (35,36).
The quantitative analysis of calcium binding to PF2 demonstrates that this fragment contains six to seven high affinity calcium binding sites with a Kd of 250 μM at pH 7.5, in good agreement with values determined for factor IX (37) and factor X (38, 39). The high affinity of the calcium binding sites indicates that, in tissues, fibrillin-1 will be totally loaded with calcium at all times (37). Tightly bound calcium may play a role in stabilizing the tertiary structure of fibrillin-1 (Fig. 5). The exact ligands for calcium binding in the fibrillin-1 pEGF domains are unknown. From mutation and NMR studies, primarily of isolated factor IX pEGF domains (35,40), it would appear that at least Asp-47, Asp/Asn-49, Glu-50, Asp/Asn-64, and positions 48 and 65 are involved, with Asp, Glu, and Asn providing side chain carbonyl oxygens and positions 48 and 65 backbone carbonyl oxygens (Fig. 5). In the multiple tandem repeat of pEGF motifs, the distance between Cys-82 in one motif and Cys-51 of the following motif is constant at 5 amino acid residues. This region, which contains 4 of the 7 expected amino acid residues involved in calcium binding, is not disulfide-bonded and so acts as a hinge between the two pEGF motifs. However, if these residues bind a calcium ion, the hinge will be locked into place. It is also possible that other residues between Cys-73 and Cys-82 are ligands for calcium, further stabilizing the geometry of the intermotif regions. The general effect of calcium binding on fibrillin-1 would probably be to decrease the flexibility of this rod-shaped molecule and perhaps facilitate its incorporation into the microfibril and stabilize the microfibrillar structure. In support of this, it has been shown that chelation of divalent cations from extracted microfibrils, isolated from cultured dermal fibroblasts, resulted in a reversible gross disruption of microfibril morphology (41).
Most PF2 peptides have been completely sequenced, identifying a pattern of modification to the tandemly repeated pEGF domains in that region (Fig. 4). All 43 pEGF domains of fibrillin-1 contain the consensus sequence CX(D/N)XXXX(F/Y)XCXC for β-hydroxylation of asparagine/aspartic acid (42). Within that consensus sequence, there is an invariant asparagine residue at domain position 64 (Fig. 5), except in pEGF28, which has an aspartic acid residue in that position (also conserved in fibrillin-2). Peptide sequence analysis of fragments from PF2 shows partial hydroxylation of that asparagine/aspartic acid, indicating that this is a regulated modification. Fibrillin-1 is the first connective tissue protein known to carry this particular post-translational modification.
The unusual serine modification has not been previously described, and its precise nature is unknown, although from preliminary carbohydrate analyses it does appear to contain xylose. One other xylose-based modification of serine has been described at a conserved site in EGF-like domains of other proteins, but it is in a position different from that of the serine residues modified in fibrillin-1 (30). Xylose-based modifications are extremely rare in mammalian glycoproteins and their functional role is unknown, although protein-protein recognition has been proposed (43).
Hydroxyasparagine and serine modifications always occur coincidentally within a given pEGF domain in PF2. It is not known if the corresponding appearance of the two modifications is significant, but it is possible that the two events are co-regulated. The complexity of PF2 is further increased by the presence of an N-linked carbohydrate in pEGF28. During protein sequencing, no amino acid was detected in this position, which is a good indication of glycosylation of asparagine in an NXT sequence. As all pEGF domains in PF2 bind one calcium ion and at least one domain does not contain post-translational modifications, the hydroxylation of asparagine and derivatization of serine are not required for calcium binding. The lack of a requirement for hydroxyasparagine in calcium binding is in agreement with studies of factor IX expressed in yeast, which was not post-translationally modified but nevertheless bound calcium with no significant change in affinity (35,37). Structural studies of a pEGF domain with bound calcium also indicated that the hydroxyl group is facing away from the calcium (40).
It has been proposed that disruption of single calcium binding sites by mutations in pEGF domains is a predominant mechanism in the pathogenesis of Marfan syndrome (23,29). However, there is currently no direct evidence that disruption of calcium binding plays a role in the manifestation of the Marfan syndrome phenotype. The overlap between the calcium binding consensus sequence and the consensus sequences for glycosylation and hydroxylation makes it unlikely that a mutation in those regions would disrupt only calcium binding without altering other properties as well. Consequently, it may not always be accurate to attribute the effects of a mutation solely to a disturbance of calcium binding.