Type I Collagen Segment Long Spacing Banding Patterns: EVIDENCE THAT THE α2 CHAIN IS IN THE REFERENCE OR A POSITION*

Densitometric scans of electron micrographs of type I collagen segment long spacing crystallites stained with uranyl acetate or phosphotungstic acid and uranyl acetate have been correlated with computer-synthesized scans derived from the sequence of the α1(I) and α2 chains. Three models that differ in the location of the α2 chain were used in the computer synthesis; Models A, B, and C have the α2 chain in the A, B, and C chain positions, respectively. For all 13 experimental scans, the order of decreasing correlation with the models was found to be A,B,C. The probability of getting the same order of decreasing correlation all 13 times is 6/6¹³. It was also determined at the 0.99 confidence level that the mean of the differences in the correlation coefficients among the models is greater than 0, supporting the conclusion that sequence-derived models best fit the experimental data when the α2 chain is in the A position. Our results also agree with recent studies that show that uranyl ions bind to both positively and negatively charged residues on collagen type I.
The triple helical structure of type I collagen that has been proposed on the basis of compositional and x-ray diffraction data is well known (1). Each molecule is composed of one α2 and two α1(I) chains wound around each other to form the rigid collagen helix. Each α chain consists of 338 triplets, Gly-X-Y, in the triple helical region, where X and Y can be any amino acid except glycine and tryptophan. At each end of the triple helical region are short nonhelical telopeptides. Each α chain is staggered by 1 or 2 residues in the triple helical region with respect to the other 2 chains so that there is a glycine at every axial position along the molecule. The A chain is the α chain that has the first glycine in the triple helical region starting from the NH2 terminus; the B chain is shifted by 1 residue with respect to the A chain so that its first triple helical glycine is the second of the molecule. The C chain is shifted by 2 residues with respect to the A chain, so it has the third triple helical glycine in the molecule. The relative shift between chains allows residues separated by 1 or 2 residues in neighboring chains to have the same axial position along the molecule (2). In type I and III collagens, this becomes important since many of the charged residues are found in attractive pairs separated by 1 and 2 residues.
* This work was supported by National Science Foundation Grant PCM 8118268 and National Institutes of Health Grant GM 30425. Parts of this work are based on the Biomedical Engineering Senior Project of E. B. at Boston University's College of Engineering. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ To whom correspondence should be addressed.
In the presence of adenosine triphosphate at acid pH, collagen molecules align without stagger to form a lateral aggregate termed a segment long spacing (SLS) crystallite. These aggregates have a characteristic banding pattern when stained with heavy metals and viewed in an electron microscope, consisting of 58 bands according to the nomenclature introduced by Bruns and Gross (3). The positive staining pattern generated is the result of interactions between the stain and the charged amino acid residues. Studies by von der Mark et al. (4) suggested that a direct correlation could be made between the position of charged residues and the staining pattern of a 112-residue peptide in the COOH terminus. The staining pattern of the SLS is therefore directly related to the charge distribution of the collagen molecule. Since the positions of the charged residues in α1(I) and α2 are not identical (5) and the chains are displaced by 1 or 2 residues relative to each other in the molecule, the SLS banding pattern potentially contains information concerning the position of the α2 chain in the triple helix.
The exact nature of the inter- and intrachain interactions that stabilize the molecule, as well as the intermolecular interactions that stabilize fibrils, depends on the location of the α1(I) and α2 chains, i.e. whether they are in the A, B, or C position. Several models of intrachain, interchain, and intermolecular interactions have been proposed which demonstrate the role of charge-charge and hydrophobic interactions in the genesis and stability of type I fibrils (2, 5-14). Experimental studies (14) suggest that many of the charged residues on collagen type I are not free but are involved in interactions. Interactions between attractive charged pairs separated by 1 and 2 residues may be characterized by both electrostatic and hydrophobic components (2). In this report, we present indirect evidence that the α2 chain is in the A position.

MATERIALS AND METHODS
Uranyl acetate (UA)-stained SLS of rat tail tendon collagen were prepared by dialysis of rat tail tendon collagen solutions at 4 °C versus 0.4% ATP, pH 2.7, overnight (15). A drop of the solution was placed on carbon/collodion-coated copper grids and positively stained with saturated aqueous UA. The pH of the unbuffered staining solution was 3.8.
Stained crystallites were viewed with a Philips 300 transmission electron microscope and photographed at a magnification of 60,000.
Calf skin type I SLS stained with phosphotungstic acid (PTA) and UA (PTA-UA) were obtained from Dr. Romaine Bruns, Developmental Biology Laboratory, Massachusetts General Hospital, Boston (3). One SLS was obtained from Ref. 16 and was similar to the PTA-UA-stained SLS based on visual comparison. Negatives of UA- and PTA-UA-stained SLS were scanned with a Gilford spectrophotometer in the visible range using a gel scanning attachment. The scans were manually digitized and entered into a PDP-11/03 minicomputer for data analysis.

Theoretical charge density profiles of type I collagen SLS were synthesized using the sequences of the chick α1(I) (17) and calf α2 chains by varying the position of the α2 chain among the A, B, and C chain positions of the triple helix. For the models based on the positions of both negatively and positively charged amino acid residues, the axial charge distribution of each α chain was obtained by assigning a value of 1 to the positions of charged residues (Glu, Asp, Arg, Lys, Hyl, His) and a value of 0 to all other positions. For the models based on the positions of the negatively charged residues, a value of 1 was assigned to locations of Glu and Asp only. The charge distribution of the triple helix was obtained for different positions of the α2 chain by axially shifting and summing the charge distributions of the 3 chains. Using this approach, the sequence of each α chain becomes a series of ones and zeros; these series were shifted (B and C chains) and summed to generate the profiles of the three models. For example, in Model A, the α2 chain is in the reference or A position, and the charge distribution for the molecule was obtained by summing the charge distribution of the α2 chain with the charge distribution of an α1(I) chain axially shifted by 1 residue (B position) and with that of another α1(I) chain shifted axially by 2 residues with respect to the A chain (C position). Models B and C were obtained in a similar manner with the α2 chain in the B and C chain positions, respectively. The 3 model scans were smoothed using "top hat," triangular, and 1 + 2 cosine as the weighting functions in a weighted moving average. The triangular weighting function consistently yielded the best results (i.e. highest correlation coefficients).
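The synthesis described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the program used in this work: the function names are hypothetical, the two sequences in the usage example are short made-up strings rather than the real chain sequences, residues are given in one-letter code, and hydroxylysine is assumed to be written as Lys (K) because it has no standard one-letter symbol.

```python
# Illustrative sketch only; names and example sequences are hypothetical.
CHARGED_ALL = set("EDRKH")  # Glu, Asp, Arg, Lys (Hyl assumed written as K), His
CHARGED_NEG = set("ED")     # Glu and Asp only


def charge_series(sequence, charged):
    """Return 1 at each charged residue and 0 at every other position."""
    return [1 if residue in charged else 0 for residue in sequence]


def shift(series, offset):
    """Move a series `offset` residues along the axis by prepending zeros."""
    return [0] * offset + list(series)


def triple_helix_profile(alpha1, alpha2, alpha2_position, charged=CHARGED_ALL):
    """Sum the three chains with the alpha2 chain in position 'A', 'B', or 'C'.

    A is the reference chain (shift 0), B is shifted by 1 residue and
    C by 2 residues; the other two positions hold alpha1 chains.
    """
    offsets = {"A": 0, "B": 1, "C": 2}
    chains = {position: alpha1 for position in offsets}
    chains[alpha2_position] = alpha2
    shifted = [shift(charge_series(chains[position], charged), offset)
               for position, offset in offsets.items()]
    length = max(len(series) for series in shifted)
    padded = [series + [0] * (length - len(series)) for series in shifted]
    return [sum(values) for values in zip(*padded)]


def triangular_smooth(profile, width):
    """Weighted moving average with a triangular window of odd `width`.

    Points beyond either end of the profile are treated as zero.
    """
    half = width // 2
    offsets = range(-half, half + 1)
    weights = [half + 1 - abs(k) for k in offsets]
    total = sum(weights)
    smoothed = []
    for i in range(len(profile)):
        acc = 0.0
        for k, weight in zip(offsets, weights):
            j = i + k
            if 0 <= j < len(profile):
                acc += weight * profile[j]
        smoothed.append(acc / total)
    return smoothed


# Usage with made-up six-residue "chains"
model_a = triple_helix_profile("GPKGER", "GDRGAK", "A")
model_b = triple_helix_profile("GPKGER", "GDRGAK", "B")
print(model_a)  # [0, 1, 1, 1, 1, 2, 2, 1]
print(model_b)  # [0, 0, 2, 1, 2, 1, 2, 1]
```

Both toy models contain the same total number of charges; only their axial placement differs, which is why the smoothed model profiles look nearly identical and a numerical comparison is needed to separate them.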
The experimental scans were digitally stretched and shifted until bands 15 and 36 were visually aligned with peaks 15 and 36 of the synthesized scans. The average value of the experimental scan was then set at 500 and the minimum value was set at 0. An average scan was generated by summing and normalizing the experimental scans. The limiting factor in the resolution obtained from the micrographs was the sampling interval. Each experimental scan typically had approximately 250 points before the digital stretching was performed; with a molecular length of ~1000 residues, this corresponds to a resolution of about 1.2 nm, or 4 residues. A Houston Instruments Digital Plotter (HIPLOT) was used in generating Figs. 1 and 2.
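The normalization step and the sampling arithmetic can be illustrated as follows. This is a minimal sketch with hypothetical function names; it assumes that "minimum set at 0, average set at 500" means subtracting the minimum and then rescaling linearly, which the text does not spell out.

```python
def normalize_scan(scan, target_mean=500.0):
    """Shift a scan so its minimum is 0, then scale its average to target_mean."""
    lowest = min(scan)
    shifted = [value - lowest for value in scan]
    mean = sum(shifted) / len(shifted)
    if mean == 0:
        raise ValueError("a constant scan cannot be rescaled")
    scale = target_mean / mean
    return [value * scale for value in shifted]


def average_scan(scans):
    """Point-by-point average of equally long scans, normalized again."""
    count = len(scans)
    averaged = [sum(values) / count for values in zip(*scans)]
    return normalize_scan(averaged)


# Sampling arithmetic quoted in the text: about 250 points over ~1000 residues
residues_per_point = 1000 / 250          # 4 residues per sampled point
implied_rise_nm = 1.2 / residues_per_point  # about 0.3 nm per residue

print(normalize_scan([1, 2, 3]))          # [0.0, 500.0, 1000.0]
print(residues_per_point)                 # 4.0
```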
Correlation between experimental scans and models was performed by translating each experimental scan versus each model to get the highest correlation coefficient between each set, similar to the technique used by Meek et al. (18). The correlation coefficient (r) takes on values between +1 and -1; when r = 0, there is no correlation between the scans, when r = +1, the scans have the highest positive correlation, and when r = -1, the scans have the highest negative correlation. The formula used to calculate r is as follows:

$$ r = \frac{n\sum_i X_i Y_i - \sum_i X_i \sum_i Y_i}{\sqrt{\left[n\sum_i X_i^2 - \left(\sum_i X_i\right)^2\right]\left[n\sum_i Y_i^2 - \left(\sum_i Y_i\right)^2\right]}} $$

The experimental and model data are the X_i and Y_i values, and n is the total number of points used in the calculation. The subscript i is the number of the residue with respect to the first triple helical glycine. For the UA-stained correlations, i = 255 to 705 (n = 451), while for the PTA-UA-stained correlations, i = 165 to 945 (n = 781); these ranges are approximately SLS bands 13 through 38 and 9 through 52, respectively. Over these ranges, the experimental scans were most similar to each other and yielded the highest correlation coefficients with each other and with the models.
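The translation search can be sketched as below. The sketch assumes that r is the ordinary product-moment correlation coefficient, written in its computational form with n and the sums of X, Y, XY, X², and Y²; the function names, the `max_shift` parameter, and the toy data are hypothetical, and real scans would first have to be resampled onto the residue axis.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equally long series."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    if denominator == 0:
        return 0.0  # a constant series carries no correlation information
    return numerator / denominator


def best_translation(scan, model, start, stop, max_shift):
    """Slide `scan` against `model` and keep the offset giving the highest r.

    r is computed over model indices start..stop (inclusive); offsets that
    would run off either end of the scan are skipped.
    """
    best_r, best_offset = -2.0, 0
    for offset in range(-max_shift, max_shift + 1):
        lo, hi = start + offset, stop + offset
        if lo < 0 or hi >= len(scan):
            continue
        r = pearson_r(scan[lo:hi + 1], model[start:stop + 1])
        if r > best_r:
            best_r, best_offset = r, offset
    return best_r, best_offset


# Toy data: the "scan" is the model displaced by two points
model = [0, 1, 0, 2, 0, 3, 0, 1, 0, 0]
scan = [0, 0] + model
r, offset = best_translation(scan, model, start=2, stop=7, max_shift=3)
print(offset)  # 2
```

With the ranges quoted in the text, `start` and `stop` would be 255 and 705 for the UA-stained scans (451 points) and 165 and 945 for the PTA-UA-stained scans (781 points).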
A constant axial displacement per residue (h spacing) was assumed for the SLS. As shown below, 2 different staining techniques (UA versus PTA-UA) yield the same order of correlation, which suggests that the h spacing in the SLS crystallites is approximately constant. The high correlation between the staining pattern and location of charged residues indicates that ATP used in forming the SLS does not significantly change the charge distribution of the molecule.

RESULTS AND DISCUSSION
The staining pattern of collagen SLS crystallites is well known to represent the charge profile of the molecule. Until recently (22), it was believed that negatively charged residues stain with UA whereas both negatively and positively charged residues stain with a combination of PTA and UA. Since the α chains are staggered axially by either 1 or 2 residues in the collagen triple helix and the charge profile for the α2 chain is different from that of α1(I), the predicted SLS banding pattern based on the sequence depends on the location of the α2 chain. In our studies, we considered the 3 possible locations of the α2 chain: the A, B, and C positions (Models A, B, and C, respectively). The scans of the 6 UA-stained type I SLS and their average are shown in Fig. 1, and the scans of the 7 PTA-UA-stained type I SLS and their average are shown in Fig. 2. Tables I and II show the correlation matrices for the scans derived from UA- and PTA-UA-stained SLS. As seen from these tables, the experimental scans show a high degree of correlation with each other as well as with the average scan. The experimental scans correlate with the average scan better than with the individual scans for both staining techniques. The average of the correlation coefficients shown in Row 7 of Table I (0.861) is less than the average of Row 8 in Table II (0.906). This result suggests that PTA-UA staining may be more reproducible.
A smoothed version of the Model A, B, and C scans, derived from the positions of the negatively charged residues, is shown in Fig. 1; it is evident that Models A, B, and C are almost exactly alike and visually there is no way to decide which model fits the experimental data best. A smoothed version of the Model A scan derived from the positions of the negatively and positively charged residues is shown in Fig. 2; Models B and C are visually identical with Model A, as in Fig. 1, so they are not shown. Although the differences among the models appear too small to tell them apart, the correlation coefficients calculated by the computer reveal a small yet significant difference among the models as is shown below.
TABLE II. Correlation matrix for PTA-UA-stained collagen type I SLS scans. Experimental scans (1-7) and the average scan (8) are correlated against each other.
The maximum correlation coefficient between experimental and model scans generally occurs when a triangular weighted moving average of width N = 19 is performed on the charge profile. The smoothing done by the weighted moving average is required since the sequence-generated charge profile has a higher resolution than the electron micrographs. Meek et al. (18) found that the correlation between the sequence and the native banding pattern was maximized using a top hat or square weighting function with a width of 11 residues. Fig. 3 shows the results of correlations performed between the average of the UA-stained SLS scans and the 3 models which represent the positions of the negatively charged residues. Note that the order of decreasing correlation coefficients is A,B,C. The maximum correlation coefficient (N = 19) of 0.68 is below what was expected based on the high correlation of the sequence with the fibrillar banding pattern reported by Meek et al. (18). Later it was determined that the maximum correlation coefficient increased to about 0.88 when the average of the UA-stained SLS scans was correlated with the models obtained from the positions of both the negative and positive charges in the sequence (Fig. 4). This figure also indicates that the order of decreasing correlation is A,B,C. If the order of correlation between the models and SLS scans were a random event, the probability of the same order occurring all 13 times would be 6/6¹³, or less than 1 in 10⁹. Based on probability, it is unlikely that the correlation order A,B,C is a random event. Although the difference among the correlation coefficients is very small (Tables III and IV), it was determined that at the 0.99 level of confidence the mean of the differences in the correlation coefficients among the models is greater than 0. Based on these experimental data, we conclude that the α2 chain is in the A or reference position. Previous theoretical interaction studies have suggested that the α2 chain is either in the A (19) or B position (20, 21).
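The probability quoted for the ordering argument can be checked with exact arithmetic. The short sketch below assumes, as the argument implies, that under the random hypothesis the 3! = 6 possible rankings of Models A, B, and C are equally likely and that the 13 scans are independent; the variable names are illustrative.

```python
from fractions import Fraction
from math import factorial

n_models = 3
n_scans = 13
n_orders = factorial(n_models)  # 6 possible rankings of A, B, C

# Chance that one particular ranking (say A,B,C) occurs in every scan
p_specific = Fraction(1, n_orders) ** n_scans

# Chance that all scans share the same ranking, whichever ranking it is
p_same_order = n_orders * p_specific  # 6/6**13, which reduces to 1/6**12

print(p_same_order)         # 1/2176782336
print(float(p_same_order))  # roughly 4.6e-10
```

The factor of 6 in the numerator accounts for the fact that any of the six rankings could have been the repeated one; the chance of specifically A,B,C in all 13 scans is six times smaller still.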
It was also possible to conclude based on correlation with the models (Figs. 3 and 4) that UA stains both negatively and positively charged residues in type I collagen. This conclusion is consistent with the recent findings of Tzaphlidou et al. The maximum correlation coefficient with UA staining (0.88) was similar to that obtained with PTA-UA staining; however, with the latter staining technique, the correlation could be performed over a larger portion of the molecule.
As an extension of these findings with the SLS crystallites, we considered how the correlation of the native banding pattern of D-staggered fibrils with a sequence-generated pattern was affected by placing the α2 chain in different positions. If one assumes that the correlation order A,B,C of the triple helical models is correct, then it seems reasonable that the native banding pattern should correlate in the same order.
Meek et al., however, did not find significant differences in the correlation coefficient with the α2 chain in different positions (18). The results of our initial studies indicate that the telopeptide model used in the synthesis of the native banding pattern affects the correlation more than the position of the α2 chain. We have found that by leaving out the telopeptides in the synthesis of the native banding pattern the order of correlation of the triple helical models (A,B,C) is preserved. A half-extended telopeptide model also maintains the order of correlation of the triple helical models; however, the differences among the models are very small. A fully extended telopeptide model gives us a correlation order of C,B,A. We feel these results indicate that a useful criterion for evaluating different telopeptide models is the effect they have on the correlation order of the triple helical models. Further work on different telopeptide models is now under way.