¹H NMR-based Determination of the Three-dimensional Structure of the Human Plasma Fibronectin Fragment Containing Inter-chain Disulfide Bonds*

Human plasma fibronectin is a plasma glycoprotein that plays an important role in many biological processes. It consists of two identical 230-250-kDa subunits that are joined by two disulfide bonds near their carboxyl termini. Each subunit contains various binding domains composed of three types of homologous repeats. Recent work has determined the three-dimensional structures of various repeat fragments, but little is known about the three-dimensional structure of the carboxyl-terminal region. A recent NMR study of a plasmin-digested carboxyl-terminal interchain disulfide-linked heptapeptide dimer has proposed that the two subunits are arranged in an antiparallel fashion (An et al. (1992) Biochemistry 31, 9927-9933). We have now determined the three-dimensional structure for a substantial portion of a trypsin-digested interchain disulfide-linked 52-residue (6 kDa) fragment of the carboxyl terminus of human plasma fibronectin (which includes the above-mentioned heptapeptide dimer) using two-dimensional NMR methods and a new strategy for NMR-based protein structure determination. The NMR data require that the two chains be linked by the interchain disulfide bonds in an antiparallel fashion.

the Kringle structure derived from the crystal structure of the prothrombin Kringle 1 unit (Holland et al., 1987; Constantine et al., 1992). The solution structure of the tenth type III repeat (94 residues) has been determined by 2D and 3D NMR (Baron et al., 1992), and the crystal structure of a Fn type III domain from tenascin, an extracellular matrix protein, has been determined by x-ray crystallography (Leahy et al., 1992); both structures consist of two antiparallel β-sheets with an immunoglobulin-like fold.
The two subunits of Fn are joined near their carboxyl termini by two disulfide bonds. Despite recent progress in obtaining detailed information concerning the primary sequence and gene structure of Fn, and the three-dimensional structures of the various repeat fragments, comparatively little is known about the three-dimensional structure of the carboxyl-terminal region. In particular, the spatial arrangement of the subunits about these disulfide bonds, i.e. whether the monomeric chains are arranged in a parallel or an antiparallel fashion in the dimer, has not been conclusively determined. A recent report, based on analysis of 2D NMR data for a 14-residue fragment of the carboxyl-terminal region, has suggested that the two subunits are arranged in an antiparallel fashion, and has proposed two alternative three-dimensional structures for aqueous and dimethyl sulfoxide environments. There are, however, problems with the analysis and resulting structures presented in that work that are discussed in more detail in the discussion section below.
In this work, we have purified the carboxyl-terminal 6-kDa Fn fragment containing two 26-residue fragments with interchain disulfide bonds, and have determined the three-dimensional structure of the disulfide-linked region by 2D NMR methods. We report NMR assignments for 25 of the 26 residues in the monomer of the 6-kDa carboxyl-terminal fragment. The NOE data and a new strategy for NMR-based protein structure determination have been used to build the three-dimensional structure of the Thr3-Pro14 segment containing the interchain disulfide bonds. We show that the NMR data are consistent with only one set of structures: those in which the two interchain disulfide bonds linking the monomers of the Fn molecule are arranged in an antiparallel fashion. The resulting structure for the disulfide-linked region of our 52-residue fragment differs substantially from those reported by An et al. (1992) for their 14-residue fragment. We suggest that the highly truncated heptapeptide dimer may not retain its native conformation in either aqueous or dimethyl sulfoxide solution.

EXPERIMENTAL PROCEDURES
Sample Preparation-Fn was purified from fresh-frozen human plasma, obtained from the Blood Center of Southeastern Wisconsin, on a Sepharose 4B column and a gelatin-Sepharose 4B column, arranged in tandem (Engvall and Ruoslahti, 1977). The integrity and purity of the protein were routinely examined with sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). Purification of the carboxyl-terminal 6-kDa fragment was performed essentially as described by Garcia-Pardo et al. (1984) with some modifications. Fn (typically 60 mg) was incubated with trypsin at a 33001 (w/w) ratio of Fn:trypsin for 30 min. Digestion was stopped by adding phenylmethylsulfonyl fluoride to a final concentration of 0.1 mM and soybean trypsin inhibitor at a 2:1 ratio of inhibitor:trypsin. The material was then applied to the following columns (2.5 × 8 cm, approximately 40 ml, flow rate of 50 ml/h) arranged in tandem: gelatin-Sepharose, heparin-Sepharose, and DE52. The final column was then washed with 10 mM Tris (pH 7.4) and 75 mM NaCl. The 6-kDa fragment, along with the 34-kDa fragment, was eluted with 10 mM Tris at pH 7.4, containing 0.5 M NaCl. The fractions were pooled, dialyzed, concentrated by partial lyophilization, and loaded onto a Sephadex G-50 column (1.5 × 95 cm, approximately 170 ml, flow rate of 8 ml/h) for final purification. The 6-kDa fragment from this step was free of higher molecular weight fragments as judged by SDS-PAGE (silver-stained, 15% acrylamide gel). The typical yield was about 0.5 mg of the 6-kDa fragment per 60 mg of plasma Fn. The amino acid composition and sequence of the 6-kDa fragment were determined and found to be identical with the 26 carboxyl-terminal residues of plasma Fn (Peterson et al., 1983). For NMR measurements, about 3 mg of the 6-kDa fragment was lyophilized to dryness and redissolved in 0.5 ml of either D2O, or H2O containing 5% D2O, giving a final peptide concentration of about 1 mM.
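The quoted sample concentration follows directly from the mass, the nominal molecular mass, and the volume. The short sketch below reproduces that arithmetic; it assumes a nominal molecular mass of exactly 6000 Da for the dimer fragment (the text gives only "6 kDa"), and the function name is illustrative.

```python
def millimolar(mass_mg: float, mol_weight_da: float, volume_ml: float) -> float:
    """Concentration in mM from solute mass (mg), molecular mass (g/mol), and volume (ml)."""
    # mg divided by g/mol gives mmol; multiply by 1000 to get micromoles.
    micromoles = mass_mg / mol_weight_da * 1000.0
    # micromoles per ml is numerically equal to mM.
    return micromoles / volume_ml


# About 3 mg of a nominal 6000-Da fragment (assumed value) in 0.5 ml.
nmr_sample_mm = millimolar(mass_mg=3.0, mol_weight_da=6000.0, volume_ml=0.5)
print(f"{nmr_sample_mm:.2f} mM")
```

With these inputs the result is about 1 mM of dimer, in line with the concentration stated above.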
¹H NMR Spectroscopy-NMR data were accumulated on a General Electric GN500 spectrometer equipped with a 1280 Nicolet computer. Phase-sensitive two-dimensional (2D) COSY and NOESY data sets were collected in the hypercomplex mode (States et al., 1982), with standard pulse sequences and phase expressions (Jeener et al., 1979; Wider et al., 1984). NOESY data were acquired with mixing times of 125 and 250 ms. Relayed COSY experiments in the absolute value mode (Wagner, 1983) were used to help identify spin systems of side chains. All 2D data sets consisted of 1024 complex points in the t2 dimension and either 256 or 320 complex points in the t1 dimension. All data processing was done on a Silicon Graphics IRIS computer, using the FTNMR software (Hare Inc., Woodinville, WA). Digital filtering was used prior to Fourier transformation in every case. All NOESY spectra were base-line-corrected by a fifth-order polynomial. Chemical shifts are referenced to the water signal, which is 4.79 ppm from 4,4-dimethyl-4-silapentane-1-sulfonate at 25 °C.
Computational Methods-Two different approaches were examined for the determination of spatial structure from NMR data: (i) the deterministic distance geometry (DG) approach (Wüthrich, 1989), followed by energy refinement; and (ii) a build-up strategy (BUILD), using a probabilistic model of protein conformation (Sherman et al., 1987, 1988; Sherman and Johnson, 1993).
The DSPACE software package (Hare Research, Inc., Woodinville, WA) was used in the DG approach to generate structures consistent with covalency constraints and semiquantitative estimates of interproton distance constraints, starting with random initial atomic coordinates (Nerdal et al., 1988). 2.0 Å was used as a lower limit of distance constraints, and 2.6, 3.3, and 4.0 Å were used as upper limits for qualitatively observed strong, medium, and weak cross-peak intensities, based on the estimation of distances from the experimentally observed volume integrals for a sampling of NOE cross-peaks (using a proportionality constant derived from the volume integrals of the cross-peaks relating δH and εH of Phe12 across a distance of 2.5 Å). Structure refinement was obtained by randomization of coordinates followed by a cycle of simulated annealing and conjugate gradient minimization of penalty functions. All atoms in the segment considered were subjected to simulated annealing. Similarly, all atoms were considered in the calculations of penalty function and gradient. Since the NMR data indicate a symmetrical structure, symmetry was used as a constraint in the energy refinement process. Additional structures were generated by random embeds, and the energy refinement and simulated annealing algorithms were repeated to minimize the constraint violations for the new structures. A set of structures, generated by DSPACE and selected qualitatively for their conformational diversity by comparison of (φ,ψ) plots, was also used as starting structures for restrained energy minimization (DGREM) and restrained molecular dynamics (DGRMD) calculations using the CHARMM software package, with empirical energy potentials taken from Brooks et al. (1983). All calculations were performed on a Silicon Graphics workstation.
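The calibration of NOE cross-peak volumes against a known reference distance can be illustrated with a short sketch. It assumes the usual isolated two-spin approximation, in which the cross-peak volume scales as the inverse sixth power of the interproton distance; that relation, the function names, and the rule of assigning the tightest bound that still covers the estimate are illustrative assumptions and are not taken from DSPACE. The 2.5 Å reference distance and the 2.0, 2.6, 3.3, and 4.0 Å limits are the values quoted above.

```python
# Upper limits (in angstroms) quoted in the text for strong, medium, and weak cross-peaks.
UPPER_LIMITS = (2.6, 3.3, 4.0)
LOWER_LIMIT = 2.0


def noe_distance(volume, ref_volume, ref_distance=2.5):
    """Estimate an interproton distance from a cross-peak volume.

    Assumes volume is proportional to r**-6 (isolated spin pair, no spin diffusion),
    calibrated on a reference cross-peak of known distance.
    """
    return ref_distance * (ref_volume / volume) ** (1.0 / 6.0)


def distance_bounds(distance):
    """Return (lower, upper) bounds using the tightest upper limit covering the estimate.

    Returns None when the estimate exceeds the weakest (4.0 angstrom) limit.
    """
    for upper in UPPER_LIMITS:
        if distance <= upper:
            return (LOWER_LIMIT, upper)
    return None


# Hypothetical volumes in arbitrary units; the reference peak has volume 1.0.
estimate = noe_distance(volume=0.4, ref_volume=1.0)
print(round(estimate, 2), distance_bounds(estimate))
```

Binning the estimates into three broad classes, rather than using them directly, keeps the constraints semiquantitative, as described in the text.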
The BUILD approach uses, in addition to NMR data, a priori information on empirical distribution functions for backbone conformations, generated from the high resolution x-ray structures in the Protein Data Bank. NMR information regarding the presence or absence of sequential d connectivities (NOE cross-peaks among nearest neighbor residues) is used for statistical prediction of backbone conformations of individual amino acid residues. A three-step procedure was followed: (i) estimation of a starting set of angular coordinates (local conformations) from the NMR data (Sherman et al., 1987); (ii) determination of the spatial structure by a gradual buildup process (Sherman et al., 1988); (iii) structure refinement by energy minimization on unconstrained structures. The FISINOE program (Sherman and Johnson, 1992) was used to estimate the φ,ψ values with corresponding probabilities for each residue, given the dαN, dNN, and dβN connectivities. The most probable φ,ψ values for each residue were used as the starting set of angular coordinates. The BUILD strategy to obtain a spatial structure from the starting set of backbone conformations utilizes an optimality principle in which the fragment considered at any stage has a minimum number of residues and a maximum number of restrictions. Long-range NOE requirements and the value of conformational energy were used as steering parameters to guide the BUILD process. Two general assumptions were used: (i) the upper limit of distance constraints is 3.3 Å for sequential d connectivities, and 4 Å for long-range NOEs; and (ii) all dihedral angles must lie within sterically allowed regions in conformational space. For the sake of convenience, all NOEs other than sequential are termed long-range. The interactive graphics package INSIGHT (Biosym Technologies, San Diego, CA) was used to construct the starting structures, with backbone conformations estimated using FISINOE.
All energy minimizations were performed using CHARMM on a Silicon Graphics workstation. The force constant for the dihedral constraints was reduced in gradual steps from 50.0 kcal·mol⁻¹·rad⁻² to 2.5 and finally to 0.0, decreasing it by about a factor of two following each cycle of 250 steps of conjugate gradient minimization. A macroscopic dielectric constant of 10 was used for these calculations. Calculations using dielectric constants of 1, 2, 4, 10, 30, and 80 showed that conclusions based on energy considerations were unaffected for dielectric constants greater than 4. Although the same starting values were used for the dihedral angles of a particular residue in the two chains, symmetry was not used explicitly as a constraint in the energy minimization process. In general, distance constraints also were not included explicitly in the calculations, except where specifically stated.
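One possible reading of the gradual release of the dihedral constraints is sketched below. The intermediate force constants are not listed in the text, so the sketch assumes evenly spaced geometric steps whose ratio is as close as possible to two, followed by a final unconstrained cycle; the function name and this spacing rule are assumptions, and the code does not call CHARMM.

```python
import math

STEPS_PER_CYCLE = 250  # conjugate gradient steps per cycle, as stated in the text


def force_constant_schedule(k_start=50.0, k_end=2.5):
    """Force constants (kcal/mol/rad^2) for successive minimization cycles.

    Assumed rule: geometric spacing between k_start and k_end with a ratio
    close to two, then a last cycle with the constraint removed (0.0).
    """
    n_steps = max(1, round(math.log2(k_start / k_end)))
    factor = (k_start / k_end) ** (1.0 / n_steps)
    schedule = [k_start / factor ** i for i in range(n_steps + 1)]
    schedule[-1] = k_end  # remove floating-point round-off on the last constrained value
    schedule.append(0.0)  # final, fully unconstrained cycle
    return schedule


schedule = force_constant_schedule()
for cycle, k in enumerate(schedule, start=1):
    print(f"cycle {cycle}: k = {k:.2f}, {STEPS_PER_CYCLE} steps")
```

Under this assumption the constraint is released over six cycles, with a reduction factor of about 2.1 between consecutive constrained cycles.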
The Thr3 to Arg25 segment was readily assigned by a combination of 2D NOESY, COSY, and RELAY COSY experiments in D2O and H2O. Resonances for Glu26 and the amide resonances for Thr1 and Asn2 were not readily observable. The threonine and valine side chains were identified by a relay experiment in D2O that showed the cross-peak between the Cα-proton and the γ-methyl protons, and also by their characteristic spin patterns and NOEs in the COSY and NOESY spectra, respectively. The asparagine side chains were identified by their chemical shifts in the COSY, and, in the case of Asn4 and Asn6, by NOEs between the β-methylene protons and the γ-NH2 group (Fig. 1). The two prolines (Pro8 and Pro14) were identified by their characteristic cross-peak patterns in both COSY and NOESY spectra (illustrated for Pro8 in Fig. 2). The lone isoleucine (Ile9) was readily identified because of its unique spin system (Fig. 2). Similarly, Phe12 was easily assigned, being the only aromatic residue in the fragment. The ortho, meta, and para protons of Phe12 were traced out in a COSY spectrum, and the β-methylene protons were identified by their strong NOEs to the aromatic protons (Figs. 1 and 3). Most of the peptide backbone of the Thr3-Arg25 segment could be traced by dαN connectivities in a 250-ms NOESY spectrum in H2O (illustrated for the Thr3-Met13 segment in Fig. 1). The two sections on either side of Pro8 (and Pro14) could be connected by NOEs relating the protons on the proline ring to protons of the preceding and following residues. A strong NOE was observed between the α-H of Cys7 (and Met13) and a δ-H of Pro8 (and Pro14) (illustrated for Cys7-Pro8 in Fig. 2), indicating a trans peptide bond for Cys7-Pro8 (and Met13-Pro14). The α-H of Pro8 (and Pro14) was connected by NOE to the amide proton of Ile9 (and Leu15) (illustrated for Pro8-Ile9 in Fig. 1). Intraresidual NOEs were observed between all amide protons and their own Cβ-protons, except for residues Asp23 and Ser24 in the Thr3-Arg25 segment.
Several important long-range NOEs were observed in the 125-ms NOESY spectrum in D2O (Fig. 3). A strong NOE was observed between the Phe12 δ-H and the Ile9 α-H. More importantly, the ε- and ζ-protons of Phe12 showed definite NOEs to both β-methylene protons of Cys7 and Cys11. Also, one of the β-methylene protons of Cys7 showed NOEs to both β-methylene protons of Cys11.

Chemical shifts of assigned resonances are shown in Table I. Relevant NOE data are summarized in Table II. Only a few minor NOE cross-peaks remain unassigned, some of which may be due to impurities in the sample.
Data Interpretation-An equilibrium situation with multiple conformations is typically encountered with short linear peptides in solution. However, the 6-kDa fragment of Fn, a dimer containing 52 residues, is roughly the size of bovine pancreatic trypsin inhibitor. The size of this fragment, the fact that it is a dimer, and the presence of two interchain disulfide bonds are expected to lend some conformational rigidity to its structure in solution. Comparison of ¹H chemical shifts (Table I) of the Thr3 to Arg25 segment with those of random coil structures (Bundi and Wüthrich, 1979) shows that several resonances within the Thr3-Pro14 sequence have chemical shifts sufficiently different from random coil structures to indicate definite conformational preference in this segment containing the interchain disulfide bonds. Consistent with this observation, examination of Table II shows that inter-residue NOEs other than those showing dαN, dNN, dβN connectivities (Table II, column 6) are observed only within this segment. For proline-containing peptides, the presence of multiple conformers is often indicated by the observation of two distinct sets of resonances corresponding to cis- and trans-prolines, since the rate of exchange between the species containing the two isomers is slow on the NMR time scale (Wüthrich, 1976; Larive et al., 1992). Resonances corresponding to only the trans conformers were observed for both Pro8 and Pro14 in the 6-kDa Fn fragment, confirmed by the presence of strong NOE cross-peaks for both prolines. Based on these observations, it was assumed that a substantial population of the conformers of the 6-kDa fragment in solution had a preferred conformation for the Thr3-Pro14 segment, the structure being more flexible further away from the interchain disulfide bonds. Calculation of the three-dimensional structure, therefore, was attempted only for the Thr3-Pro14 segment.
This was found to be more than adequate to demonstrate that the two interchain disulfide bonds linked the Fn monomers in an antiparallel fashion in the 6-kDa carboxyl-terminal dimer fragment.
The NMR information indicates a symmetrical structure: chemical shifts are identical for the same residue in both chains. This potentially complicates the interpretation of NOE data, since no distinction can be made between intrachain and interchain NOEs. Hence, a cross-peak relating residues i and i+1 through space might not necessarily indicate a sequential connectivity, and could, in principle, represent a long-range NOE cross-peak relating residues i on chain 1 and (i+1)' on chain 2 of the dimer. However, a statistical analysis of short proton-proton distances in a collection of data from high resolution protein crystal structures (Billeter et al., 1982) shows that 88% of the αH-NH distances ≤ 3.0 Å represent sequential connectivities. The corresponding probabilities for NH-NH and βH-NH are 88 and 76%, respectively. The probabilities are even higher when such short interproton distance limits are imposed simultaneously: the probability that the connectivities are sequential when (dαN ≤ 3.6 Å and dNN ≤ 3.0 Å) is 99%; it is 95% when (dαN ≤ 3.6 Å and dβN ≤ 3.4 Å) and 90% when (dNN ≤ 3.0 Å and dβN ≤ 3.0 Å).
In other words, protein folding patterns in nature very rarely lead to short proton-proton distances relating main chain and Cβ-protons of residues that are not immediate neighbors on a polypeptide chain. This conclusion was applied to the NOE data from the 6-kDa fragment, although the uniqueness of the interchain disulfide bonds makes it an unusual case. A conservative estimate of the upper limit for the observed dαN, dNN, dβN connectivities is 3.3 Å, based on the estimation of distances from the measured volume integrals for a sampling of αH-NH, NH-NH, and βH-NH NOE cross-peaks (using a proportionality constant derived from the volume integrals of the cross-peaks relating δH and εH of Phe12 across a distance of 2.5 Å). Hence, the simplest interpretation of the NOE data in Table II is that all NOEs showing connectivities between main chain protons, and between main chain and Cβ-protons (i.e. dαN, dNN, dβN), represent intrachain sequential connectivities. The statistical analysis noted above indicates that this should constitute a fairly accurate representation of the three-dimensional structure. This interpretation was used initially in both the deterministic and the probabilistic analyses. Because of the observed chemical shift symmetry, the set of sequential NOE connectivities was assumed to apply equally to both chains in the dimer. All dNβ connectivities were assumed to be intraresidual, and hence intrachain only.

Footnotes to Table II:
Intra-residual NOE: dNβ(i,i); + denotes NOE to the single β proton for these residues; ++ denotes NOE to both β protons.
Sequential connectivities: d(i,i+1). + and blank spaces, respectively, indicate the presence and absence of NOEs; - denotes the absence of a proton in the residue (e.g. NH in Pro), such that the particular NOE cannot be present. "?" indicates the absence of NOE information (e.g. due to bleaching of cross-peaks caused by water suppression using presaturation). In such cases, both possibilities (presence and absence of the corresponding d connectivity) were considered, and two regions in φ,ψ space were predicted by FISINOE (see Table III).
d(i,j) connectivities are not repeated again as d(j,i). While these NOEs are represented here as intrachain only (i.e. involving residues (i,j)), the data do not exclude interchain NOEs (i.e. those involving residues (i,j')); both possibilities are considered in the analysis (see text). Subscripts (β',β") and (δ',δ") have been used to represent the two Cβ- and Cδ-protons (column 6), differentiating them in terms of their chemical shifts only (see Table I), and do not represent stereospecific assignments, which can be obtained from the χ1,χ2 values in Table III.
The ambiguity concerning intra- versus interchain NOEs arises solely due to the fact that the dimer structure is symmetric; otherwise residues n and n' (i.e. the same residue on the two chains of the dimer) would be distinguishable by their different chemical shifts. If only intrachain symmetry is assumed, then the assumption that all "sequential" NOEs are intrachain is valid. On the other hand, if even a few of these sequential NOEs are interchain, then the identical chemical shift requirement imposes interchain symmetry in addition to intrachain symmetry. In other words, if the dαN connectivity observed between residues n and n+1 (and hence between n' and (n+1)') is interpreted as an interchain connectivity between n and (n+1)', the chemical shift symmetry requires that there must be a dαN connectivity between n' and n+1. This additional interchain symmetry would place more stringent constraints on the structure. Also, from consideration of steric hindrance alone, it appears unlikely that the interchain symmetry requirement could be satisfied in any arrangement of the dimer structure. This is supported by the results of a set of calculations using conformational analysis alone (described under "Conformational Analysis"), without any consideration of the NOE data.
For the deterministic approach, both intra- and interchain distance constraints were initially included for all NOE connectivities involving side chain protons. Later, during the process of minimization of the penalty function for distance constraints, those particular constraints that contributed to large violations were gradually removed in an attempt to minimize the distance violations.
For the probabilistic approach, the sequential NOEs were the only NOEs used. Since distance constraints were not used explicitly in this method, building the spatial structure did not require that any prior decisions be made regarding whether the other observed NOEs (involving side chain protons) were intra- or interchain NOEs. All unconstrained energy-refined structures were carefully examined to check for consistency with experimentally observed NOEs. During this examination, both intra- and interchain distances were checked to ensure that the selected structures were consistent with at least the minimum number of experimentally observed NOEs involving side chain protons in the particular dimer fragment in question.
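The consistency check described above, in which an observed NOE may be accounted for by either an intrachain or an interchain proton pair, can be sketched as follows. The coordinate dictionaries, the proton labels, and the function name are hypothetical; the 4 Å default is the upper limit quoted for long-range NOEs under "Computational Methods."

```python
import math


def explain_noe(chain_a, chain_b, proton_i, proton_j, upper=4.0):
    """Report how a symmetric dimer can account for an NOE between protons i and j.

    chain_a and chain_b map proton labels to (x, y, z) coordinates in angstroms.
    Returns "intra", "inter", "both", or None if no pair lies within the upper limit.
    """
    # Shortest intrachain distance, taken over the two chains.
    intra = min(math.dist(chain_a[proton_i], chain_a[proton_j]),
                math.dist(chain_b[proton_i], chain_b[proton_j]))
    # Shortest interchain distance, taken in both directions.
    inter = min(math.dist(chain_a[proton_i], chain_b[proton_j]),
                math.dist(chain_b[proton_i], chain_a[proton_j]))
    if intra <= upper and inter <= upper:
        return "both"
    if intra <= upper:
        return "intra"
    if inter <= upper:
        return "inter"
    return None


# Hypothetical coordinates for one beta proton of Cys7 and one of Cys11 in each chain.
chain_1 = {"Cys7_HB": (0.0, 0.0, 0.0), "Cys11_HB": (10.0, 0.0, 0.0)}
chain_2 = {"Cys7_HB": (10.0, 3.0, 0.0), "Cys11_HB": (0.0, 3.0, 0.0)}
print(explain_noe(chain_1, chain_2, "Cys7_HB", "Cys11_HB"))
```

In this toy arrangement the two protons are 10 Å apart within each chain but 3 Å apart across chains, so the NOE can only be accounted for as an interchain contact.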
Structure Determination Assuming All Sequential Connectivities to be Intrachain-The set of 55 approximate interproton distance constraints per chain, derived from the observed NOE data, was used both with and without the additional symmetry constraint to calculate three-dimensional structures of the Thr3 to Pro14 segment, with the specific purpose of determining whether the NMR data indicated a parallel or an antiparallel arrangement of the two chains linked by the two interchain disulfide bonds.
The Deterministic Approach-The more commonly used DG approach, followed by energy refinement, was used first.
Several DG, DGREM, and DGRMD structures of the Thr3-Met13 segment were obtained, all of which roughly satisfied the NOE constraints within this dimer fragment (with small total distance violations). The structures did not fall into any closely related sets, and showed large variations in the main chain conformation when compared in pairs. Addition of symmetry as a constraint in the energy refinement also gave numerous possible solutions (with identical monomer conformations in the dimer), again with a wide variation in the main chain conformation, as described above. This approach showed that constrained structures, roughly satisfying all NOE requirements, were possible for both the parallel and the antiparallel dimer forms. However, all of the structures calculated in this way contained several dihedral angles well outside the sterically allowed regions. Therefore, comparison of the calculated conformational energies was not an adequate criterion, either for selecting a set of preferred structures from the many converged constrained structures, or for deciding whether the NMR data indicated a parallel or an antiparallel arrangement of the two chains in the dimer. We concluded that the number of constraints available was not sufficient for this strategy to work.
The Probabilistic Approach-We then applied the BUILD procedure described above, using the d connectivities shown in Table II (dαN, dNN, dβN) to estimate corresponding regions in φ,ψ space for each residue, shown in Table III (column 2).
The most probable φ,ψ values corresponding to each region (Table III, column 3) were used as the starting set of φ and ψ angles for the 12-residue monomer segment, Thr3-Pro14. The dαN connectivity data for three of these 12 residues (Asn4, Asn6, and Phe12) are ambiguous, permitting two possible regions in φ,ψ space (see Table III, column 2). However, the two possible regions are close to each other in conformational space, and an average of the most probable φ,ψ values corresponding to the two regions was used as the starting dihedrals in these cases. Extension of this method, and consideration of the intraresidual NOE data between amide- and Cβ-protons, was used to obtain the possible combinations of χ1 and χ2 angles for the side chain conformations (Sherman and Johnson, 1991; Sherman and Johnson, 1993). A four-step "buildup" procedure was followed to construct the final three-dimensional structure: (i) The Cys7-Cys11 segment, containing the interchain disulfide bonds, was constructed first, because of the four long-range NOE restrictions between the Cβ-protons of Cys7 and Cys11 (see Table II) present in this segment. The Ile9 and Glu10 side chains were initially truncated to alanine, making the testing sequence Cys-Pro-Ala-Ala-Cys. (ii) Phe12 was added to this sequence, and, in two subsequent steps, alanines at positions 9 and 10 were replaced by Ile9 and Glu10. (iii) The Thr3-Asn6 segment was then added. As in (i) and (ii), calculations were first performed with Asn4 and Asn6 replaced by alanine. (iv) Met13 and Pro14 were finally added to complete the segment. This procedure reduced the total number of calculations required from 4096 × 2 (for parallel and antiparallel structures) to only 90. Pro8 and Pro14 were modeled to be in the trans configuration, as indicated by the presence of strong NOE cross-peaks (Table II).
The dihedral angles estimated by the FISINOE program were used as the starting angular coordinates. The Cys7-Cys11 segment was built using the interactive graphics package, INSIGHT. CHARMM was then used to "patch" two such segments in either a parallel or an antiparallel fashion, via the two interchain disulfide bonds. CHARMM was also used for energy minimization, with gradual reduction, and finally elimination, of force constants for dihedral constraints, as described under "Experimental Procedures."
Arrangement of Monomer Chains in the Dimer-The question of parallel versus antiparallel arrangement of the monomers in the dimer fragment was answered at the first phase of the calculations in step (i), using the Cys7-Cys11 segment only. A summary of the results for this step of the calculations, using both parallel and antiparallel arrangements of the monomer segment Cys-Pro-Ala-Ala-Cys to form the dimer, is given in Table IV. Of the four possible conformations corresponding to different combinations of χ1 values for the pair of cystines in the monomer, none satisfied all four long-range NOE requirements between the Cβ-protons of Cys7 and Cys11 (considering both intra- and interchain connectivities) for a parallel dimer structure. Two of the parallel structures (see rows i and ii in Table IV) failed to satisfy all sequential NOE requirements as well. This is apparent in the large deviations (Table IV) in backbone conformation from that estimated using sequential NOE data in FISINOE. The use of distance constraints, in addition to dihedral constraints, also did not lead to any unconstrained parallel dimer structures consistent with all NOE data. In the antiparallel dimer structure, it was possible to select one of the four combinations of χ1 conformers for the pair of cystines in the monomer (with χ1 = g- = -60° for both Cys7 and Cys11; see row 1 in Table IV), since only this conformation satisfied all sequential and long-range NOE requirements. One interesting difference between the parallel and antiparallel structures was that the symmetry requirement was satisfied, even in the absence of any explicit symmetry constraints, in all four of the unconstrained antiparallel structures, but in none of the parallel structures. Since the probabilistic approach uses a priori information regarding empirical distribution functions for backbone conformations in addition to the NMR data, the available NOE data were sufficient for this method to distinguish between parallel and antiparallel structures.

Footnotes to Table III:
Both regions are shown in column 2 for each of these residues. Column 3, therefore, contains two sets of the most probable φ,ψ values (in degrees) corresponding to these two regions.
* Initial conformations. The initial χ1 and χ2 values are indicated as rotamers g+, g-, t, and p, representing 60, -60, 180, and 90°, respectively. The IUPAC-IUB conventions (Hoffmann-Ostenhof, 1974) were followed in naming the torsion angles. (For valine, it should be noted that these conventions are different from those used in CHARMM and INSIGHT.)
Final conformation. Column 5 shows the rotamers describing the various side chain conformations, including the energetically indistinguishable set of conformations for the side chains of Ile9, Glu10, and Met13 that lead to the set of 16 energy refined unconstrained structures consistent with the NMR data. Values of φ,ψ and χ1,χ2 (in degrees) shown in columns 6 and 7 are for one of the 16 conformations (see text for angular root mean square deviations for pairs of these 16 structures).

Footnotes to Table IV:
Only ... are shown, since armsd values for ψ are similar for all eight ...
The maximum deviation in ψ from estimated values has been included to show that some low energy conformations (see rows i and ii in B) correspond to backbone dihedrals that have deviated more than 60° from the initial estimates consistent with sequential NOEs (or more than 3σ, where the armsd, σ = 20°, for FISINOE estimates of (φ,ψ) in each region). This, in turn, implies that the corresponding unconstrained structures do not satisfy all of the sequential NOEs.
Addition of distance constraints to satisfy long-range NOEs resulted in large deviations of backbone dihedrals from FISINOE estimated values, implying violation of sequential NOE requirements. When both long-range and sequential NOEs were included as distance constraints, the resulting structures contained backbone dihedrals in sterically forbidden regions.
e The single structure satisfying all sequential and long-range NOE requirements.
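The criterion used with Table IV, that a backbone dihedral deviating from its FISINOE estimate by more than 60° (three times the 20° angular root mean square deviation) signals a violated sequential NOE, can be expressed as a short sketch. The angle values below are hypothetical and the function names are illustrative; the only part taken from the text is the 3σ threshold with σ = 20°. Differences are wrapped so that, for example, 170° and -170° count as 20° apart.

```python
import math


def angle_diff(a, b):
    """Signed difference a - b in degrees, wrapped into the interval [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def armsd(angles, reference):
    """Angular root mean square deviation (degrees) between two sets of dihedrals."""
    diffs = [angle_diff(a, r) for a, r in zip(angles, reference)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def max_deviation(angles, reference):
    """Largest absolute wrapped deviation (degrees) between two sets of dihedrals."""
    return max(abs(angle_diff(a, r)) for a, r in zip(angles, reference))


def exceeds_estimate(angles, reference, sigma=20.0, n_sigma=3.0):
    """True if any dihedral has drifted more than n_sigma * sigma from its estimate."""
    return max_deviation(angles, reference) > n_sigma * sigma


# Hypothetical refined and estimated dihedrals (degrees) for a five-residue segment.
estimated = [-170.0, -60.0, -60.0, -65.0, 130.0]
refined = [170.0, -75.0, -50.0, 60.0, 140.0]
print(round(armsd(refined, estimated), 1), exceeds_estimate(refined, estimated))
```

Reporting the maximum deviation alongside the armsd matters because a single large excursion, such as the fourth angle in this toy example, can be masked in the average.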
Conformational Analysis-Since the results above were obtained assuming all dαN, dNN, dβN, and dNδ connectivities to be intrachain for the Cys7-Pro8-Ala9-Ala10-Cys11 sequence, conformational analysis was performed for this sequence to check whether other conformations (not predicted by the assumed intrachain sequential d connectivity patterns) were energetically favorable for this dimer fragment containing interchain disulfide bonds. For simplicity, only two main classes of backbone conformations were considered for each residue in creating the starting structures: (i) the twisted or α conformation, with (φ,ψ) = (−60°, −60°); and (ii) the extended or β conformation, with (φ,ψ) = (−60°, 135°). Since the actual sequence has a charged residue in position 10 (Glu10), the (60°, 60°) conformation (or αL), on the right side of the (φ,ψ) map, was also considered for Ala10. As a preproline residue, Cys7 was restricted to the β conformation. The structure obtained above, by assuming all sequential d connectivities to be intrachain, may be represented as the ββααα(g−,g−) conformation under this notation, with χ1 angles for both Cys7 and Cys11 close to g− (see Table IV, row 1). In accordance with the requirement of chemical shift symmetry, identical starting conformations were used for the two monomer chains. Considering that the χ1 angle for each of the 2 cystines per monomer (Cys7 and Cys11) may take three values (g− = −60°, t = 180°, g+ = 60°), this leads to a total of 108 × 2 structures (for parallel and antiparallel forms). The conformational energies calculated (using CHARMM) for the unconstrained structures were compared with that of the structure in the ββααα conformation, with χ1 = −60° (g−) for both Cys7 and Cys11.
When a structure with a comparable energy was obtained, the conformation of each residue was examined to see which of the sequential d connectivities (previously assumed to be intrachain) were not present. Interproton distance calculations were then performed, using the coordinates of the energy-refined unconstrained structure, to check whether the corresponding interchain connectivities were present instead, in order to satisfy the NOE requirements. A summary of the results is given in Table V. Eight structures with conformational energies comparable to the ββααα(g−,g−) conformation (see Table V, row 1) were found. In all of these structures, the monomers were linked in the antiparallel fashion to form the dimer, and all retained conformational symmetry in the unconstrained structures (i.e. contained identical conformations for the two monomers). However, interchain d connectivities (in place of missing intrachain sequential d connectivities) were not found in any of these structures, so that none of them satisfied all NOE requirements.
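The interproton distance check described above can be illustrated as follows. This is a minimal sketch under stated assumptions, not the procedure actually used: the 5 Å cutoff is a commonly quoted upper limit for an observable NOE and is not taken from the source, and the proton labels and coordinates are invented for the example.

```python
import math

NOE_CUTOFF = 5.0  # angstroms; assumed upper limit, not from the source


def classify_contact(chain_a, chain_b, atom_i, atom_j, cutoff=NOE_CUTOFF):
    """Say whether a proton pair is close within a chain, across chains, or both.

    chain_a and chain_b map proton labels to (x, y, z) coordinates in
    angstroms for the two symmetry-related monomers.
    """
    intra = math.dist(chain_a[atom_i], chain_a[atom_j]) <= cutoff
    inter = math.dist(chain_a[atom_i], chain_b[atom_j]) <= cutoff
    if intra and inter:
        return "both"
    if intra:
        return "intrachain"
    if inter:
        return "interchain"
    return "none"


# Toy coordinates, hypothetical and for illustration only.
chain_a = {"Cys7:HB2": (0.0, 0.0, 0.0), "Cys11:HB2": (9.0, 0.0, 0.0)}
chain_b = {"Cys7:HB2": (12.0, 0.0, 0.0), "Cys11:HB2": (3.0, 0.0, 0.0)}

result = classify_contact(chain_a, chain_b, "Cys7:HB2", "Cys11:HB2")
print(result)
```

Because the two chains have identical chemical shifts, the spectrum alone cannot separate the two cases; a coordinate-based check of this kind is what assigns a given NOE to an intrachain or an interchain pair.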
(It is important to note here that in a few of these structures, the long-range NOEs relating Cys7 and Cys11 could be satisfied by intrachain, rather than interchain, connectivities. Therefore, whether the structure is parallel or antiparallel, it cannot simply be assumed that NOEs relating Cys7 and Cys11 must be interchain.) Conformational energies for all parallel dimer structures were much higher than that obtained previously, assuming all sequential connectivities to be intrachain (see Table IV). Interchain d connectivities (in place of missing intrachain sequential d connectivities) were not found in any of these structures; nor were all the long-range NOE requirements satisfied. Also, none of these parallel dimer forms retained conformational symmetry in the unconstrained structures (i.e. conformations for the two monomers were not identical in the unconstrained structure), thus violating the identical chemical shift requirement. Therefore, use of the chemical shift symmetry constraint, with the help of conformational analysis, was sufficient to demonstrate that the monomers in the carboxyl-terminal dimer fragment of Fn must be linked in an antiparallel fashion by the interchain disulfide bonds.
Table V footnotes:
Initial backbone conformations used are compared with conformations in the unconstrained, energy-refined structures (where the initial values of (φ,ψ) for twisted (α) and extended (β) conformations are assumed to be (−60°, −60°) and (−60°, 135°)). The angular root mean square deviations (armsd) and maximum deviation from the initial values of φ and ψ (columns 7 and 8) have been evaluated in order to estimate the "goodness of fit" of the unconstrained structures to the initial conformations.
The "reference" structure, i.e. the structure obtained assuming all sequential d connectivities to be intrachain. Note that of the nine conformations shown in this table only this conformation satisfies all NOE requirements.
At this juncture, it is important to note that the observation of NOE contact between Cys7 and Cys11 did not, of itself, rule out a parallel structure; extensive additional conformational analysis was needed to show that only an antiparallel structure satisfied all of the NOE constraints. The analysis by An et al. (1992) noted the existence of a ROESY cross-peak between the Cys7 Hα and the Cys11 Hβ2 (our numbering) and stated that "a close distance between the 2 Cys residues unambiguously indicates an antiparallel" arrangement. However, the chemical shift symmetry makes it impossible to distinguish whether this ROESY cross-peak is intrachain or interchain. Thus, the ROESY data alone, without any modeling calculations, could not rule out a parallel structure in which the 2 cystines in the monomer are spatially close, but the disulfide bridges link Cys7-Cys7′ and Cys11-Cys11′, since ROESY cross-peaks between residues 7-7′ and 11-11′ cannot be observed due to the chemical shift symmetry. In fact, as we have shown above, it is quite possible to obtain a conformation in which Cys7 and Cys11 are close enough within the monomer to produce NOE contact. It is only the additional modeling that demonstrates that this monomer conformation cannot then be coupled with a second monomer to form a symmetric dimer that satisfies all NOE constraints. Thus, the analysis of An et al. (1992) assumed an antiparallel arrangement, but did not prove it.
Structure Consistent with NMR Data-At least nine different antiparallel dimer structures were found with comparable conformational energies for the Cys-Pro-Ala-Ala-Cys sequence, with four different main chain conformations (ββααα, βαααα, βββαα, βαβαα). However, as described in the above discussion on conformational analysis, only one of these (ββααα(g−,g−) in Table V, row 1) satisfied all NOE data: the structure obtained assuming all sequential d connectivities to be intrachain. Therefore, even if the Cys7-Cys11 segment did exhibit multiple conformations in solution, it appears likely that a substantial population exists in the ββααα(g−,g−) conformation. Only this structure was extended to build the three-dimensional structure of the Thr3-Pro14 segment. Intraresidual and sequential d connectivities shown in Table II were assumed to be intrachain in all subsequent calculations. Only the antiparallel arrangement of the two chains was used, with χ1 = g− for both Cys7 and Cys11.
On addition of Phe12 to the Cys7-Cys11 segment, only one of two possible conformations, with χ1 = −60° (or g−) and χ2 = 90° (or p), for the Phe12 side chain was found to satisfy the long-range NOEs between the Cβ-protons of Cys7 and the ring protons of Phe12 (taking into account both intra- and interchain connectivities). Since there were no strong long-range NOE constraints relating the Cys7-Phe12 segment to the rest of the structure, all subsequent calculations in the build-up procedure used the following energy criterion to choose the set of most probable conformations at each step: for a set of χ1 or χ2 rotamers, all conformations with energy greater than the lowest energy conformation by 4 kcal/mol (calculated using a dielectric constant of 10) were eliminated. On replacing the first alanine by Ile9 in the Cys-Pro-Ala-Ala-Cys-Phe segment, four energetically equivalent conformations were obtained.
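The energy criterion used in the build-up procedure amounts to a simple window filter. The sketch below is illustrative and not the authors' code: the function name, the rotamer labels and the energies are hypothetical, and treating a difference of exactly 4 kcal/mol as eliminated is an assumption, since the text does not specify the boundary case.

```python
def filter_by_energy(conformers, window=4.0):
    """Keep conformers lying within `window` kcal/mol of the lowest energy.

    conformers is a list of (label, energy) pairs; energies in kcal/mol.
    """
    lowest = min(energy for _, energy in conformers)
    return [(label, energy) for label, energy in conformers
            if energy - lowest < window]


# Hypothetical rotamer energies, for illustration only.
rotamers = [("g-/p", -52.1), ("g-/t", -50.0), ("t/p", -47.9), ("g+/p", -44.0)]
kept = filter_by_energy(rotamers)
print([label for label, _ in kept])
```

Applying the filter after each residue addition keeps the number of candidate structures small without requiring long-range distance constraints.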
Replacing the second alanine by Glu10 led to 16 conformations (4 × 4), eight of which were eliminated by energy criteria. Similarly, extending the sequence by the Thr3-Asn6 segment led to 32 possible conformations for the Thr3-Phe12 segment (8 × 2 × 2), 24 of which were eliminated by energy criteria.
Addition of Met13-Pro14 resulted in 32 possible conformations (8 × 4 × 1), 16 of which were selected, using energy criteria, to be the final set of structures consistent with the NMR data. Although symmetry was not used explicitly as a constraint in the energy minimization process, the symmetry built into the backbone conformation at the start was largely retained in the final set of energy-refined unconstrained structures. Similarly, long-range NOE requirements were also found to be satisfied in the final 16 structures, and were obtained as a by-product of energy refinement, without the use of distance constraints in the minimization process.
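The conformer counts quoted for the last three build-up steps are mutually consistent, as the following bookkeeping check shows. It only restates the arithmetic in the text; the step labels and the split of each product into "conformations carried forward" and "new multiplicities" are our reading, and the variable names are hypothetical.

```python
from math import prod

# (step, conformations carried forward, multiplicities of new choices, kept)
steps = [
    ("Glu10 replaces second Ala", 4, (4,), 16 - 8),
    ("Thr3-Asn6 added", 8, (2, 2), 32 - 24),
    ("Met13-Pro14 added", 8, (4, 1), 16),
]

candidates = []
for label, carried, multiplicities, kept in steps:
    total = carried * prod(multiplicities)
    candidates.append(total)
    print(f"{label}: {total} candidates, {kept} kept")

# Each step must start from the number kept at the previous step.
consistent = all(steps[i][3] == steps[i + 1][1] for i in range(len(steps) - 1))
```

Each step begins with the eight conformations retained at the previous one, so the candidate set never grows beyond 32 structures.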
Estimation of Precision-The backbone conformations of the set of 16 final structures were very similar, and were indistinguishable by χ² statistical criteria. The 16 structures consisted of combinations of side chain conformations for Ile9 (tt, tg', g g−, g−t), Glu10 (g−t, g−g−), and Met13 (g−t, g−g−) that are consistent with the NMR data, and are indistinguishable by energy criteria. The rotamers representing these 16 side chain conformations are shown in Table III, column 5. The angular root mean square deviation (armsd) in backbone conformations for pairs of structures within this set of 16 is 6° for φ and 10° for ψ (average of armsd values for all pairs within the 16 structures). The armsd for the estimated values of φ and ψ in each region is about 20° (Sherman and Johnson, 1992). Table III also compares the dihedral angles estimated by the FISINOE program on the basis of the available NOE data with the dihedrals in one of the 16 representative final structures. The armsd between FISINOE estimates and calculated values of backbone dihedrals is 25° for φ and 29° for ψ (average of armsd values for all 16 structures), and the estimated and calculated backbone structures are statistically indistinguishable, using χ² criteria.
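An angular root mean square deviation of the kind quoted above can be computed as sketched below. The exact definition used in the paper is not given in this section, so the formula (root mean square of residue-wise angle differences, wrapped to the range −180° to 180°, then averaged over all pairs of structures) is an assumption, and the function names are hypothetical.

```python
import math
from itertools import combinations


def angular_diff(a, b):
    """Signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def armsd(angles_a, angles_b):
    """Angular RMSD between two equal-length lists of dihedral angles."""
    diffs = [angular_diff(a, b) for a, b in zip(angles_a, angles_b)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def mean_pairwise_armsd(structures):
    """Average armsd over all distinct pairs of structures."""
    pairs = list(combinations(structures, 2))
    return sum(armsd(a, b) for a, b in pairs) / len(pairs)


# Wrapping matters: 170 and -170 degrees differ by 20 degrees, not 340.
example = armsd([170.0, -60.0], [-170.0, -60.0])
n_pairs = len(list(combinations(range(16), 2)))
print(round(example, 2), n_pairs)
```

For a set of 16 structures the average runs over 120 pairs, and the φ and ψ angles would be passed separately to obtain one value for each.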
Unlike the types I, II, and III repeat units of Fn, which contain dominant structural features that are common to many proteins, the structure of the carboxyl-terminal dimer segment reported here is somewhat unusual. Fig. 4 shows the conformation of the Thr3-Pro14 segment in the dimer form, with a single set of conformations for the side chains of Ile9, Glu10, and Met13 (as shown in Table III). The structure consists of two twisted or coiled segments, Thr3-Asn6 and Ile9-Phe12, connected by an extended region at Cys7-Pro8. These twisted regions may serve as recognition sites for the monomers, and thus help to bring the 2 cystines in each monomer into close proximity, specifically in the antiparallel orientation, so that the required interchain disulfide bridges are formed in the correct manner in the dimer. The two ends of the Thr3-Pro14 segment are connected by interchain H-bonds involving backbone atoms (NH of Asn4 of one chain to O of Pro14 of the second chain). In addition to several inter- and intrachain H-bonds involving side chains, there are four pairs of intrachain H-bonds involving backbone atoms in each monomer. The donor-acceptor pairs are: (7, 3), (11, 8), (12, 9), and (13, 10). The exposed surface of the two twisted regions contains several hydrophobic side chains (Val5, Ile9, and Phe12 followed by Met13 and However, these may be covered by the rest of the monomer (Leu15-Glu26) folding back over itself. The aromatic ring of Phe12 in each monomer lies across the two disulfide bridges, and the hydrophobic interactions involving these aromatic side chains may help stabilize the disulfide bonds. The backbone structure of the Cys7-Cys11 segment forms a loop that brings the 2 cystines in the same chain (Cys7 and Cys11) close enough to make intrachain disulfide bonds also possible. Why interchain and not intrachain disulfide bonds are observed in the 6-kDa fragment of Fn remains an intriguing question.
Several x-ray protein structures are known in which intrachain S-S bridges connect 2 cystines at i and i+4 positions (e.g. glyceraldehyde-3-phosphate dehydrogenase, insulin, lysozyme, proteinase (trypsin) inhibitor, ribonuclease T). The backbone conformation of this region in some of these structures is quite similar to that determined for the Cys7-Cys11 segment of the 6-kDa fragment. It may be conjectured that hydrophobic interactions, such as those between Pro8 and Phe12, may hinder the formation of an intrachain disulfide bond in the monomer, and that this hindrance is removed through stronger interchain hydrophobic interactions in the dimer. In all 16 final structures, the long-range NOEs relating the Cβ-protons of Cys7 and Cys11 were found to be interchain, while NOEs relating the Cβ-protons of Cys7 and ring protons of Phe12 were found to be intrachain.
A thorough understanding of the functions of Fn requires knowledge of the precise spatial arrangements of various binding domains and their interactions in the two similar subunits of the dimeric Fn molecule. Our NMR demonstration of the antiparallel arrangement of the interchain disulfide bridge near the carboxyl termini of Fn suggests that similar binding domains, such as the gelatin and cell-binding domains in different subunits, may be arranged in a diagonal manner rather than in a mirror image. The present result is consistent with previous work by Skorstengaard et al. (1986), which also suggested an antiparallel arrangement for the interchain disulfide bridge of Fn, based on HPLC patterns of peptides derived from the carboxyl-terminal 6-kDa fragment. The conformation of the Val5-Cys11 region differs substantially from either of the average conformations proposed by An et al. (1992). Several of the dihedral angles in the work of An et al. (1992) fall outside the usual sterically allowed regions. Since only the average conformations were reported by An et al. (1992), we could not compare individual structures. There are significant differences between the observed NOE contacts for our work and those reported by An et al. (1992); thus, the structural differences reflect real differences in solution conformations, and not simply different approaches to determining the structure. A detailed comparison suggests that truncation of the peptide chains at Val5 and Cys11 (our numbering, corresponding to their V1 and C7) permits substantially increased freedom for the disulfide-linked cyclic peptide ring due to loss of constraints from the extended peptide chain, and probably does not maintain that fragment within its native environment and conformation. Thus, we propose that the structure determined here from a larger fragment of the carboxyl-terminal region is more likely to represent the native conformation of the disulfide-linked segments of the carboxyl-terminal region of the two subunits.
Fn is an essential component of the extracellular matrix that controls cell growth, cell shape, and differentiation (Hynes, 1990; Mosher, 1989). In vitro, Fn molecules self-assemble into fibrils reminiscent of the fibrillar structures seen in the matrix (Vuento et al., 1980). Although the detailed structure of the Fn fibrils is not known, the current model for the assembly of the Fn fibrils, proposed by Hormann (1982), was based on a parallel interchain disulfide bridge pattern. According to this model, the fully extended form of Fn assembles into a half-staggered array to form 5-nm fibrils. Our determination of an antiparallel arrangement for the two Fn chains suggests that the fibril formation process may be far more intricate than presently perceived. It is of some interest to note that, relative to each other, the two chains extend away from the bridge region in an essentially antiparallel fashion, in sharp contrast to the "folded," parallel arrangement suggested by . Thus, one might speculate that the two chains may extend away from each other in fibril formation, although more extensive structural information would clearly be required to determine this arrangement definitively.
In conclusion, a three-dimensional structure has been determined for 24 residues of the dimer segment that contains the two interchain disulfide bonds within the 6-kDa carboxyl-terminal fragment of human plasma fibronectin. A build-up strategy for obtaining spatial structures was described, using the FISINOE method for evaluating local conformations from sequential d connectivities. The 2D NMR data are consistent with only an antiparallel arrangement of the two monomers connected via the two interchain disulfide bridges.
initial guidance in the use of the INSIGHT and CHARMM software