Highly Restricted Distributions of Hydrophobic and Charged Amino Acids in Longitudinal Quadrants of α-Helices

Helix formation in folding proteins is stabilized by binding of recurrent hydrophobic side chains in one longitudinal quadrant against the locally most hydrophobic region of the protein. To test this hypothesis, we fitted sequences of 247 alpha-helices of 55 proteins to the circular (infinite) template (symbol; see text) to maximize the strip-of-helix hydrophobicity index (the mean hydrophobicity of residues in (symbol; see text) positions). These template-predicted configurations closely matched crystallographic structures in 87% of four- or five-turn helices compared. We determined the longitudinal quadrant distributions of amino acids in the template-fitted, sheet projections of alpha-helices with respect to the best longitudinal, hydrophobic strip on each helix and to the N and C termini, interiors, and entire helices. Amino acids Leu, Ile, Val, and Phe were concentrated in one longitudinal quadrant (p less than 0.001). Lys, Arg, Asp, and Glu were not in the quadrant of Leu, Ile, Val, and Phe (p less than 0.001). Significant quadrant distributions for other amino acids and for termini of the helices were also found.

etc.) that form an axial hydrophobic strip when the subsequence is coiled as an α-helix (1). Perutz et al. (2) originally observed in the α-helices of hemoglobin the recurrence of invariant, nonpolar residues every 3.6 residues, on the average, making the interior faces of the helices nonpolar. Schiffer and Edmundson (3) created the wheel projection to identify such segments with helical potential. Eisenberg et al. (4) and Finer-Moore and Stroud (5) developed methods based upon amphipathic moments to predict α-helices. Kaiser and Taylor (6) tested the function of a longitudinal hydrophobic surface on α-helices to promote folding against a hydrophobic surface.

Fig. 1. Sheet projection of the template for an α-helix with an axial hydrophobic strip in quadrant III. Squares, hydrophobic residues; circles, residues in quadrants II and IV; triangles, residues in quadrant I when three amino acids intervene between the hydrophobic residues. A through F are cycles.
In the folding of nascent proteins, hydrogen bonds between carbonyl and amido groups in the peptidyl backbone in adjacent cycles of a helix could be stabilized by the binding of recurrent hydrophobic residues along one longitudinal quadrant of the helix to a locally hydrophobic region. Additional hydrophobic residues not fitting this recurrent template might compete for helix formation (7). The longitudinal hydrophobic strip creates a line of reference that we have used to measure the quadrant distributions of amino acids along a helix. We have found that the longitudinal hydrophobic strip is narrow, that groups of amino acids are favored or unfavored in the longitudinal strip, and that termini of helices originate in specific quadrants with respect to the longitudinal hydrophobic strip.

MATERIALS AND METHODS
All sequences of α-helices were taken from the papers of Presta and Rose (8) and Richardson and Richardson (9). We report calculations based on both methods because, for 13 proteins common to both studies, helices determined by Richardson and Richardson averaged 1.2 amino acids longer at the N terminus and 2.2 amino acids longer at the C terminus than the corresponding helices determined by Presta and Rose.
A circular template for a helix with a longitudinal hydrophobic strip was superimposed on the sequences of known helices. The template held 18 symbols (squares, circles, and triangles; Fig. 1). This generic, helical template corresponded to a sheet projection with successive coils of the helix in slanting columns and with longitudinal quadrants in horizontal rows (Fig. 1). By convention, quadrant III had the greatest strip-of-helix hydrophobicity index (SOHHI¹; the mean hydrophobicity in the Kyte-Doolittle scale of amino acids in □ positions). We superimposed the infinite template onto a helical segment 18 times by attaching each of the 18 symbols in turn to the first position in the segment. We chose the overlay which had the maximal SOHHI score. The amino acid sequences of 247 α-helices identified in crystallographic structures of 55 proteins by other investigators (8, 9) were placed in the sheet projection of an amphipathic helix (Fig. 1) to maximize the SOHHI score. The □ positions were placed in the third quadrant of the sheet projection, the △ positions were placed in the first quadrant, and the remaining ○ positions were placed in the second and fourth quadrants.

¹ The abbreviation used is: SOHHI, strip-of-helix hydrophobicity index.
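The overlay procedure described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' program: the order of quadrants within the 18-position template (`TEMPLATE`) is a hypothetical arrangement, chosen only so that five positions fall in the strip (quadrant III), three in quadrant I, and five each in quadrants II and IV; the function names are illustrative. The hydropathy values are those of the Kyte-Doolittle scale.

```python
# Kyte-Doolittle hydropathy values (one-letter amino acid codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# ASSUMPTION: hypothetical quadrant order for the 18 template positions.
# Strip (quadrant III) positions recur with spacings 4, 3, 4, 3, 4.
TEMPLATE = ["III", "IV", "I", "II", "III", "IV", "II", "III", "IV",
            "I", "II", "III", "IV", "II", "III", "IV", "I", "II"]


def sohhi(sequence, offset):
    """Mean hydropathy of the residues that land on strip positions."""
    strip = [KYTE_DOOLITTLE[aa] for i, aa in enumerate(sequence)
             if TEMPLATE[(i + offset) % len(TEMPLATE)] == "III"]
    # A very short segment may touch no strip position at this offset.
    return sum(strip) / len(strip) if strip else float("-inf")


def fit_template(sequence):
    """Try all 18 overlays; return (offset, SOHHI, quadrant per residue)."""
    best = max(range(len(TEMPLATE)), key=lambda off: sohhi(sequence, off))
    quadrants = [TEMPLATE[(i + best) % len(TEMPLATE)]
                 for i in range(len(sequence))]
    return best, sohhi(sequence, best), quadrants


# Usage: a made-up segment with Leu at every strip position of offset 0.
offset, score, quadrants = fit_template("LDDDLDDLDDDLDDLDDD")
```

When two overlays tie, `max` keeps the first offset tried; the source does not say how ties were resolved.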

Standardized deviation from expected frequency
We compared the observed set of frequencies for the four quadrants to the null distribution by means of the χ² goodness-of-fit test on three degrees of freedom and verified the results with the likelihood ratio test (14). The null probability distribution assigned 3/18 to quadrant I and 5/18 to each of quadrants II, III, and IV. To indicate possible false rejection of the null hypothesis, we distinguished results with p values of 0.05, 0.01, and 0.001. With p = 0.001 and 50 independent tests, a type I error occurs at most 5% of the time using a Bonferroni correction for multiple comparisons. For each quadrant within a statistically significant distribution of frequencies, we standardized the deviation from the expected proportion, p, using the quantity $(\text{observed frequency} - p)/\mathrm{S.E.}$, where S.E. was the standard error under the binomial model, $\sqrt{p(1 - p)/n}$, and n was the number of times the amino acid occurred. We used the standardization because the difference in proportions can be misleading. For example, if one amino acid appeared in quadrant I 20 out of 40 times while a rarer amino acid appeared in quadrant I 2 out of 4 times, both would have observed deviations of 50 − 17 = 33%, even though the first estimate rests on ten times as many observations.
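The test and the standardization can be illustrated with a short Python sketch. The function names are illustrative, and the p value uses the closed-form upper tail of the χ² distribution for exactly three degrees of freedom rather than a statistics library; the likelihood ratio check mentioned in the text is not reproduced here.

```python
import math

# Null proportions from the text: 3/18 for quadrant I, 5/18 for the others.
NULL = {"I": 3 / 18, "II": 5 / 18, "III": 5 / 18, "IV": 5 / 18}


def chi2_upper_tail_df3(stat):
    """P(X >= stat) for a chi-square variable with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(stat / 2))
            + math.sqrt(2 * stat / math.pi) * math.exp(-stat / 2))


def chi_square_gof(counts):
    """Goodness-of-fit statistic and p value for four quadrant counts."""
    n = sum(counts.values())
    stat = sum((counts[q] - n * NULL[q]) ** 2 / (n * NULL[q]) for q in NULL)
    return stat, chi2_upper_tail_df3(stat)


def standardized_deviation(observed, n, p):
    """(observed proportion - p) / S.E., with S.E. = sqrt(p(1 - p)/n)."""
    return (observed / n - p) / math.sqrt(p * (1 - p) / n)


# The example from the text: same raw deviation, different support.
z_common = standardized_deviation(20, 40, NULL["I"])  # about 5.66
z_rare = standardized_deviation(2, 4, NULL["I"])      # about 1.79

# Bonferroni bound on the familywise error: 50 tests at p = 0.001.
familywise_bound = 50 * 0.001
```

Both amino acids deviate from expectation by 33 percentage points, but the standardized deviation is about 5.66 for the common one and about 1.79 for the rare one.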

Each terminus had four amino acids, so only segments with 9 or more amino acids had nonempty interiors. Segments with 3 or fewer amino acids were excluded from analysis. For segments with 4-7 amino acids, the N and C termini overlapped. We counted the frequency of each amino acid in each quadrant in N and C termini, interiors, and the entire segment. Table I displays for each amino acid the standardized deviations (the observed proportion minus the expected proportion, divided by the standard error) over the four quadrants. We separated results into three levels of significance (p = 0.05, 0.01, and 0.001).
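The partition into termini and interior, and the per-quadrant tally, might be sketched as follows. This is an illustrative sketch only: the function names and the tuple keys are assumptions, and the quadrant labels are taken as given (one per residue) from a prior template fit.

```python
from collections import Counter

TERMINUS_LENGTH = 4  # each terminus had four amino acids


def split_segment(sequence):
    """Return N terminus, C terminus, and interior, or None if excluded."""
    if len(sequence) < TERMINUS_LENGTH:  # 3 or fewer residues: excluded
        return None
    return {
        "N": sequence[:TERMINUS_LENGTH],
        "C": sequence[-TERMINUS_LENGTH:],
        # Empty for segments of 8 or fewer residues.
        "interior": sequence[TERMINUS_LENGTH:-TERMINUS_LENGTH],
    }


def count_by_quadrant(sequence, quadrants):
    """Tally (region, amino acid, quadrant) over one helical segment."""
    counts = Counter()
    n = len(sequence)
    if n < TERMINUS_LENGTH:
        return counts
    for i, (aa, quad) in enumerate(zip(sequence, quadrants)):
        counts[("entire", aa, quad)] += 1
        if i < TERMINUS_LENGTH:
            counts[("N", aa, quad)] += 1
        if i >= n - TERMINUS_LENGTH:
            counts[("C", aa, quad)] += 1
        if TERMINUS_LENGTH <= i < n - TERMINUS_LENGTH:
            counts[("interior", aa, quad)] += 1
    return counts
```

In a segment of 4-7 residues a residue can be counted in both termini, which matches the overlap noted in the text.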
For analysis of certain helices, projections of the crystallographic coordinates of the α-carbon chain tracings were viewed along the helical axis, using the Quanta program of Polygen Corp. on a Silicon Graphics 4D/70GT computer system. From the geometric center of the helical projection, radii were drawn to quadrant III residues (□). The maximal sector angle was the absolute value of the greatest angle among those radii. The mean sector angle was the average of the absolute values for the angles between the most clockwise radius and the other radii.
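One possible reading of these two definitions is sketched below in Python. It assumes that the strip residues have already been projected onto the plane perpendicular to the helical axis, that the sector is the smallest arc containing all radii, and that angles increase counterclockwise; the function name and arguments are illustrative, and the measurements in the source were made interactively rather than with code like this.

```python
import math


def sector_angles(strip_points, center):
    """Return (maximal, mean) sector angle in degrees for strip residues.

    strip_points: (x, y) projections of the strip residues' alpha-carbons.
    center: (x, y) geometric center of the whole helical projection.
    """
    if len(strip_points) < 2:
        raise ValueError("need at least two strip residues")
    cx, cy = center
    angles = sorted(math.degrees(math.atan2(y - cy, x - cx)) % 360.0
                    for x, y in strip_points)
    n = len(angles)
    # The sector is the complement of the widest empty gap between radii.
    gaps = [(angles[(i + 1) % n] - angles[i]) % 360.0 for i in range(n)]
    # The radius just after the widest gap is the most clockwise one.
    start = angles[(gaps.index(max(gaps)) + 1) % n]
    offsets = sorted((a - start) % 360.0 for a in angles)
    others = offsets[1:]  # drop the reference radius itself
    return max(others), sum(others) / len(others)
```

For radii at 350°, 10°, and 40° this reading gives a maximal sector angle of 50° and a mean sector angle of 35°.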

RESULTS
Predicted quadrant orientations approximated crystallographic structures well. The structural relevance of our template-fitting model of α-helices was tested by direct examination of all four- or five-turn helices in 7CAT, 5CPA, 2CYP, 4LDH, 2MBN, 1MBO, and 2SNS for alignment of the residues predicted to fall in the axial hydrophobic strip of quadrant III (8, 9). Projections of x-ray crystallographic coordinates for 22 of 28 helices demonstrated a maximal sector angle among residues in the axial hydrophobic strip of 99° and a mean sector angle of 61° (Table II). We conclude that our assignments of amino acid residues to four quadrants, based on positioning of recurrent hydrophobic residues in one axial strip to maximize the SOHHI, closely matched crystallographic measurements.
The hydrophobic amino acids Leu, Ile, Val, and Phe each occurred more frequently in quadrant III than in other quadrants (p < 0.001). The absence of these residues from quadrants I, II, and IV was as remarkable as their presence in quadrant III, as reflected in the exceptionally large standardized deviations. This finding is consistent with the hypothesis that recurrent placement of such amino acids only in quadrant III stabilizes α-helical coiling in nascent proteins and that the cooperativity of binding the longitudinal hydrophobic strip (given a hydrophobic region in the protein against which to fold) governs the formation of local structure as a helix. Hydrophobic residues Leu, Ile, Val, and Phe in other quadrants of a putative helix would presumably compete for α-helix formation. Only a narrow axial hydrophobic strip appears to lead to helix formation.
The axial distributions of other amino acids were determined. The four charged amino acids Lys, Arg, Asp, and Glu were uniformly absent from the axial hydrophobic strip. Asn and Gln occurred less frequently in the axial hydrophobic strip. At the N and C termini of helices, Cys was more frequently in the axial hydrophobic strip. Averaged over entire helices, Cys was found more often in the axial hydrophobic strip, presumably to anchor a helix through a disulfide bridge into a protein. There was no radial preference for Pro occurring at the N or C termini of helices but, in the interior of a helix, Pro was more often in quadrant IV. Other amino acids, including Trp and Tyr specifically, showed no axial preferences.
The quadrants for the terminations of helices are presented in Table III. A helix was more likely to start in quadrant IV with 3 residues preceding the hydrophobic one in quadrant III. The N terminus was more likely to be an untethered loop. The last amino acid was more likely to be in quadrant I, two amino acids after the hydrophobic axial strip, and the helix was less likely to end in the hydrophobic strip.

DISCUSSION
We conclude that the template with the highest SOHHI predicts the quadrant orientations of amino acids in most α-helices well and that Leu, Ile, Val, and Phe occur almost solely within the axial hydrophobic strip. The predominance of Leu, Ile, Val, and Phe in one longitudinal, hydrophobic strip in α-helices is as striking as their absence from the other quadrants. This distribution supports the hypothesis that helical coiling is determined by stabilization of the helix against a hydrophobic region by recurrent hydrophobic residues in an axial strip. Presumably, the presence of hydrophobic residues in other axial quadrants can compete for helix formation, for example by forming β-pleated sheets if strictly alternating hydrophobicity is present in a segment facing a locally hydrophobic region of a nascent protein. Bowie et al. (10) and Matouschek et al. (11) have shown that hydrophobic residues along one side of a helix can insert in a hydrophobic core to stabilize protein structure.
The principles underlying both this study and a related analysis of coiling of a series of synthetic peptides that varied in the strengths of their axial hydrophobic strips (7) are applicable to the design of functional proteins and to the prediction of protein structure. Utilization of these principles might also aid the engineering of T cell-presented sequences either to decrease immunogenicity in proteins administered for diagnostic or therapeutic purposes or to increase immunogenicity and broaden major histocompatibility complex range for vaccine presentation (12, 13).