Biochemical and structural studies of the tetragonal crystalline modification of the Escherichia coli elongation factor Tu.

The tetragonal crystalline form of the trypsin-treated Escherichia coli protein elongation factor Tu has been analyzed by biochemical and x-ray crystallographic techniques. The crystals contain two tightly associated polypeptide fragments of molecular weight 36,000 and 6,500 which represent 97% of the native enzyme. The crystals do not contain a short internal polypeptide fragment of 14 amino acids which dissociates from the native enzyme following mild trypsin digestion. The short fragment has been implicated in the aminoacyl-tRNA binding function and its location has been determined. The structure of the modified enzyme in the P4(3)2(1)2 crystal form has been determined to 5 Å resolution by x-ray diffraction methods. The protein consists of two domains: the larger domain exhibits considerable α helical characteristics and the smaller domain has no identifiable secondary structural features. The relationship between the double domain structure of the enzyme and its biochemical properties is discussed.

host-donated subunits of the replicative complex of Qβ phage (5) and it has been reported to have marked similarities to the eukaryotic protein actin (6). Because of the multifunctional character of EF-Tu, its allosteric properties, and its ability to form a complex with both nucleic acids and other proteins, it provides a remarkably versatile system for the investigation of molecular structure-function relationships.
The elongation factor Tu is a single polypeptide chain of 44,000 daltons (7, 8)¹ that may be coded by either of two autonomous genes (9). The amino acid composition shows no unusual features, although tryptic digestion studies indicate that the NH2-terminal polypeptide contains a high proportion of basic residues (10). From partial sequence information (11), the positions of the three cysteines, two of which are required for elongation activity (12, 13), are known. The sulfhydryl group at position 100 is essential for aminoacyl-tRNA binding and reacts rapidly with alkylating agents and mercurials. Cysteine 156 is necessary for GDP, GTP, and EF-Ts binding and is protected from N-ethylmaleimide inactivation by any of the three. EF-Tu normally exists as a stable complex with guanosine diphosphate (Kdiss = 3 X M) rather than as the labile apoenzyme (7). The EF-Tu-GDP form does not interact with aminoacyl-tRNA. The protein factor EF-Ts, however, displaces the tightly bound GDP, which is subsequently replaced by GTP, thus facilitating the exchange of the nucleotide coenzymes (14). In the EF-Tu-GTP form, the holoenzyme binds and transports aminoacyl-tRNA to the ribosome (15, 16). Substantial evidence from isotope exchange (17), fluorescence studies (18, 19), and electron spin labeling (20) indicates that EF-Tu undergoes a series of conformational changes in response to the binding of these allosteric effectors. Similar transitions are observed in other proteins which bind nucleotide coenzymes (21, 22) and, accordingly, one might well expect certain associated structural motifs, such as the nucleotide fold (23), to be present in the EF-Tu molecule.
Mild digestion of EF-Tu with trypsin yields two primary products within 5 min (10, 24). The major fragment (36,000 daltons)¹ includes all of the cysteine residues and retains most of the functions of the native protein. The NH2-terminal minor peptide (6,000 to 8,000 daltons)¹ appears to remain associated with the larger fragment under nondenaturing conditions (25). However, digestion patterns monitored by sodium dodecyl sulfate-polyacrylamide gel electrophoresis indicate that the course of proteolysis is more complex. The present study indicates that trypsin digestion of EF-Tu involves multiple cleavage points and a larger number of fragments. The significance of the fragmentation pattern and its implications for the biochemical and structural properties of EF-Tu will be discussed.

¹ Though the molecular weight of EF-Tu from E. coli has been reported to be 42,000 (7) and 47,000 (8), the discrepancy is due to a difference in the techniques for the molecular weight determination rather than a difference in molecular properties or amino acid composition. To avoid confusion with other literature reports, the relative mass of the EF-Tu tryptic fragments discussed herein will be based on the value of 44,000 daltons for the undigested EF-Tu.

EF-Tu crystallizes easily from ammonium sulfate (7, 26) but this form cannot be obtained in sufficient size for a three-dimensional x-ray analysis. Large crystals grown from polyethylene glycol are known ( 2 3 , but their extreme instability makes a structural study difficult. In addition, three crystal forms of EF-Tu which have undergone brief exposure to trypsin are known (8,29,30): two orthorhombic forms now under study elsewhere (31, 32) and a tetragonal form reported by ourselves. Because it has only 1 molecule in the asymmetric unit and has suitable crystal properties, we are analyzing the tetragonal form by x-ray diffraction techniques. The crystals which are used for the present structural determination belong to space group P4(3)2(1)2 with a = 70.4 Å and c = 161.5 Å. We have found that the asymmetric unit contains the major tryptic fragment and the associated NH2-terminal polypeptide. The trypsin-treated material, however, is lacking an internal 14 amino acid sequence. The results of the structural studies at 5 Å resolution and their relevance to the biochemical properties of EF-Tu are presented here.
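As a rough plausibility check on the statement that the tetragonal cell holds one molecule per asymmetric unit, the cell volume and the crystal volume per dalton can be worked out from the quoted cell constants. The sketch below is illustrative only and is not a calculation taken from the paper: it assumes 8 asymmetric units per cell (the standard multiplicity of the general position in P4(3)2(1)2) and takes the crystallized mass as Fragments A plus D (36,000 + 6,500 daltons). The function names are hypothetical.

```python
# Hedged sketch: packing density of the tetragonal crystal form.
# From the text: a = 70.4 angstrom, c = 161.5 angstrom, 1 molecule per asymmetric unit.
# Assumptions (not stated in the text): 8 asymmetric units per unit cell for
# space group P4(3)2(1)2, and a crystallized mass of 36,000 + 6,500 daltons.

A_AXIS = 70.4            # angstrom
C_AXIS = 161.5           # angstrom
ASU_PER_CELL = 8         # assumed multiplicity of the general position
MASS_DALTONS = 36_000 + 6_500


def cell_volume(a: float, c: float) -> float:
    """Volume of a tetragonal unit cell (a = b, all angles 90 degrees)."""
    return a * a * c


def volume_per_dalton(a: float, c: float, n_asu: int, mass: float) -> float:
    """Crystal volume per unit of protein mass, in cubic angstrom per dalton."""
    return cell_volume(a, c) / (n_asu * mass)


if __name__ == "__main__":
    v = cell_volume(A_AXIS, C_AXIS)
    vm = volume_per_dalton(A_AXIS, C_AXIS, ASU_PER_CELL, MASS_DALTONS)
    print(f"cell volume: {v:.0f} cubic angstrom")
    print(f"volume per dalton: {vm:.2f} cubic angstrom/dalton")
```

Under these assumptions the volume per dalton comes out a little above 2.3 cubic angstrom per dalton, which lies within the range usually found for protein crystals and so is compatible with a single molecule in the asymmetric unit.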

Crystal Composition-The proteolysis of EF-Tu with trypsin has been reported by several independent investigators (10, 24). Although there is general agreement concerning the fragmentation pattern after prolonged trypsin treatment, there remain discrepancies in the reported fragmentation of EF-Tu immediately following the initial trypsin cleavage. Because the tetragonal crystals are prepared from trypsin-treated EF-Tu, it was necessary to reinvestigate the fragmentation pattern in order to determine the number and sizes of fragment polypeptides within the crystals. A sodium dodecyl sulfate-polyacrylamide gel electrophoretic analysis of the mild trypsin digestion of EF-Tu with time is shown in Fig. 1. Mild conditions of 1% trypsin/EF-Tu at 0°C were used to slow the rate of trypsin cleavage in order to study the initial proteolytic products.
As seen in Column 1 of Fig. 1, the initial trypsin cleavage of EF-Tu occurs at two sites, rather than at one as previously reported. After 1 min, EF-Tu (44,000 daltons) is cleaved into two major fragments of 37,500 and 36,000 daltons and their associated minor polypeptides, 6,500 and 8,000 daltons, respectively. The immediate appearance of two major fragments of EF-Tu has been observed in other published gel photographs (29, 48), but their presence has not been remarked upon, nor has any significance been attached to them. Upon further digestion, a small peptide of 1,500 daltons is removed from both the Mr = 37,500 and 8,000 fragments when the second susceptible trypsin site is cleaved. The principal fragments remaining after 5 min are 36,000 and 6,500 daltons and correspond to Fragments A and D designated previously (25). To be consistent with the fragmentation pattern of four bands, the small peptide of 1,500 daltons must represent an internal peptide sequence rather than a short peptide from either the NH2 or COOH terminus. As the deleted peptide is too small to be detected on the polyacrylamide gel system, it could not be determined from the electrophoretic analysis whether the small peptide remains as one complete fragment or is degraded further into several pieces. For clarity, a schematic diagram of the trypsin cleavage sites is presented in Fig. 2, denoting the major fragments as A' (37,500) and A (36,000) and the minor fragments as D' (8,000) and D (6,500). The small internal polypeptide is designated as Fragment F (1,500). Together, Fragments F and D comprise D' and Fragments F and A comprise A'. As the trypsin digestion is prolonged beyond 5 min, Fragment A is converted into Fragments B (23,000) and C (13,000) and Fragment D possibly cleaves into two or more peptides. Our results of prolonged trypsinolysis are in complete agreement with other reports (10, 24, 25, 49).
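The mass bookkeeping behind this fragmentation scheme can be checked with a few lines of arithmetic. The sketch below is only an illustration using the nominal masses quoted above; the dictionary and function names are hypothetical, and the masses are gel estimates rather than exact values.

```python
# Hedged sketch: consistency check of the tryptic fragment masses quoted in the text.
# All masses are the nominal gel-based values (daltons); names are illustrative.

NATIVE = 44_000

FRAGMENTS = {
    "A'": 37_500,  # major fragment after cleavage at one site only
    "A": 36_000,   # major fragment after both initial sites are cleaved
    "D'": 8_000,   # minor fragment after cleavage at one site only
    "D": 6_500,    # minor NH2-terminal fragment after both sites are cleaved
    "F": 1_500,    # small internal peptide released between D and A
    "B": 23_000,   # product of prolonged digestion of A
    "C": 13_000,   # product of prolonged digestion of A
}


def total_mass(names):
    """Sum of the nominal masses of the named fragments."""
    return sum(FRAGMENTS[name] for name in names)


def percent_of_native(names):
    """Share of the native 44,000-dalton chain, rounded to a whole percent."""
    return round(100 * total_mass(names) / NATIVE)


if __name__ == "__main__":
    # Either single cleavage accounts for the whole chain.
    print(total_mass(["A'", "D"]), total_mass(["A", "D'"]))
    # F is the difference between the primed and unprimed fragments.
    print(total_mass(["A", "F"]), total_mass(["D", "F"]))
    # Prolonged digestion splits A into B and C.
    print(total_mass(["B", "C"]))
    # Shares of the native chain for the crystallized material and for F.
    print(percent_of_native(["A", "D"]), percent_of_native(["F"]))
```

With these values, each single-site cleavage (A' + D or A + D') sums to the native 44,000 daltons, A + F and D + F reproduce A' and D', and B + C reproduces A, which is why the 1,500-dalton peptide must be internal.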
In Column C of Fig. 1, the tetragonal crystalline material is compared to the tryptic digestion pattern. The gel electrophoretic analysis indicates that the tetragonal crystals contain at least two polypeptide fragments, corresponding to Fragment A (36,000) and Fragment D (6,500). It is known that these two fragments are tightly associated and can be separated only under denaturing conditions (25). Although both Fragments A and D may be partially nicked, the percentage of nicking is less than 10% and varies slightly with the preparation of the trypsin-treated material under our mild proteolytic conditions. Together, Fragments A and D represent 97% of the molecular mass of the enzyme. Since it is not possible to detect on the present polyacrylamide gel whether the crystals also contain the remaining 3% of the molecule, namely Fragment F, another analysis was carried out. To determine whether the tetragonal crystals contained amino acid residues in addition to Fragments A and D, the trypsin-treated material was analyzed prior to crystallization. In a preliminary experiment, protein material after 5 min of trypsin action was dialyzed against water for 3 h. The dialysate was analyzed and found to contain amino acids. In a second experiment, the 5-min digestion product was passed over a Sephadex G-100 column. The eluent following the [3H]GDP binding activity was pooled and analyzed. The fractions were found to contain a single polypeptide. The fragment was sequenced by Laursen and Duffy⁴ and found to be identical to residues 66 through 79 (50). The correct sequence⁴ is Ala66-Phe-Asp-Glu-~u-Asp-Asn-Ala-Pro-Glu-Glu-Lys-Ala-Arg79.
Since the polypeptide fragment can be removed by dialysis or molecular sieve chromatography, it is not tightly associated with Fragments A and D, which are separated only under denaturing conditions. As the material for crystallization of the tetragonal form is routinely passed over a Sephadex G-100 column following trypsin digestion, the small peptide is removed prior to crystallization. Thus, the tetragonal crystals contain Fragments A and D only, which represent 97% of the native enzyme molecular weight. The crystals are lacking the small internal polypeptide sequence, Fragment F, containing 14 amino acid residues.

5 Å Model of EF-Tu-The molecular envelope of the protein is quite clear in the electron density map. The protein is comprised of two domains, a major one measuring 35 × 40 × 45 Å and a minor one measuring 25 × 30 × 35 Å. The major domain contains at least seven dense, rod-like regions which are characteristic of α helices at 5 Å resolution. As seen in Fig. 4, several pairs of long rods of density form with an interaxis angle of either 20° or 70° between them. Such packing is characteristic of the spatial arrangement of neighboring α helices in other three-dimensional structures, such as sperm whale myoglobin (51, 52). Assuming the rods represent helical components, the estimated α helical content of the total protein is 28%, which is only slightly lower than the prediction of 32% based on circular dichroism studies (1). The minor domain contains no identifiable structural features at this resolution.
Although the two domain boundaries are clear in the electron density map, the point of connection between them is less clear. Each domain appears to be almost completely surrounded by solvent with few regions of density extending into the other. The lack of appreciable electron density between the domains suggests that there may be only one peptide sequence joining the two regions. Initially, the weak connectivity presented a problem in deciding among four possibilities for how the two domains were joined to one another. However, with an additional lower contour of electron density at 10% of the maximum peak height, the connectivity was substantially stronger for one spatial arrangement of the domains than for any of the other three possibilities. Based on this most probable connection, a wooden model of the electron density was constructed and is shown in Fig. 5. Each vertical level represents a 3.2 Å spacing. The protein has a relatively compact globular shape, slightly elongated in one direction, with dimensions of 45 × 50 × 75 Å. The top view of the model in Fig. 6 illustrates the "waist" or clefts between the two domains. The solid model was cut along a contour line drawn at 15% of the maximum peak, a level which includes some solvent regions. Consequently, the wooden model appears bulkier than the electron density map indicates. In particular, the connectivity between the domains appears quite substantial, whereas there are, in fact, only about four strong connections. If the domains had been joined in one of the other possible combinations, the molecular shape and dimensions of the protein would remain essentially the same.

At this resolution, it is not possible to determine the amino acid residues to which the platinum atoms bind. All three platinum sites, however, are located on the surface of the molecule.
The Hg atom of the single mersalyl site is probably bonded to cysteine 100, the only amino acid residue of EF-Tu which is reactive with mercurial reagents in the presence of GDP in solution. This sulfhydryl group is located 21 residues from the NH2 terminus of Fragment A and is believed to be in the vicinity of the aminoacyl-tRNA binding site (12, 13). The Hg position, as indicated on the wooden model, is located on the outer boundary of the major domain. The site is also near a loosely associated region of density, approximately 35 × 18 × 10 Å, which could have been assigned to either one major domain or its 2-fold-related neighbor. This region of density is circled in Figs. 5 and 6.

DISCUSSION
It is not uncommon for proteins, otherwise noncrystallizable, to crystallize following mild treatment with a protease. This approach has provided the basis for a wide range of x-ray diffraction analyses including immunoglobulin fragments (53, 54), fibrinogen (55), canavalin (56), and fructose 1,6-diphosphatase (57). Mild proteolytic treatment appears to be a particularly useful approach for obtaining crystals of nucleic acid binding proteins or their complexes, as illustrated by the successes with tRNA synthetase (58), ribonuclease (59), lac repressor (60), and the chromatin nucleosomes (61). The success of this approach probably reflects the presence of a conformationally flexible protrusion that is required for nucleic acid binding but which is also highly susceptible to proteolysis. By application of trypsin to the protein elongation factor Tu, we have successfully produced a crystal form of the enzyme which is suited for three-dimensional x-ray diffraction studies. Moreover, the results of mild trypsin digestion of EF-Tu have proved useful in locating the GDP and aminoacyl-tRNA binding sites of the native enzyme. Within the present paper, we have also been able to use the mild trypsin cleavage results to assign the major fragments to the protein domains found in our 5 Å resolution structure of EF-Tu. Indeed, controlled proteolysis of a purified enzyme is a simple technique which can provide many useful results for structural studies.
Composition and Biological Activity of the Tetragonal Crystals-Using sodium dodecyl sulfate-polyacrylamide gel electrophoretic analysis, we have shown that the tetragonal crystals contain two polypeptide fragments of 36,000 (A) and 6,500 (D) daltons. Fragment A contains the GDP binding site, the three sulfhydryl groups, and the COOH terminus of the native enzyme and is identical to the major trypsin fragment reported by other investigators (10, 24). Fragment D contains the NH2 terminus of the native enzyme and remains tightly associated with Fragment A under nondenaturing conditions (25). Fragments A and D together represent 97% of the molecular weight of the native enzyme.
We have also determined that a third polypeptide, Fragment F, which is formed within 5 min of trypsin digestion, is not present in the tetragonal crystals. Fragment F is easily dissociated from Fragments A and D by dialysis or molecular sieve chromatography prior to crystallization. We have isolated Fragment F and its sequence has been determined (50). Fragment F is composed of 14 amino acid residues from alanine-66 to arginine-79 in the NH2-terminal region and represents 3% of the molecular weight of the native enzyme.
Our previously reported assays of the tetragonal crystals indicate that the crystalline material is active in binding EF-Ts and GDP but has a reduced affinity for GTP. We were also unable to detect ternary complex formation between phenylalanyl-tRNA and the crystalline material using nitrocellulose filtration or a polypeptide synthesis assay on ribosomes (1). In recent experiments, we have demonstrated that the polypeptide region containing the 14 amino acids of Fragment F is essential for the interaction of EF-Tu with aminoacyl-tRNA. Since the tetragonal crystalline material is lacking the Fragment F polypeptide, the inability of the crystals to form a ternary complex is consistent with these findings on the small peptide.
Correlation of Low Resolution Model of EF-Tu with Biochemical Studies-There is general agreement that the major tryptic fragment, A, is cleaved into two smaller fragments, B and C, of 23,000 and 13,000 daltons, respectively. Fragments B and C remain tightly associated under nondenaturing conditions and are not susceptible to further cleavage under mild trypsin digestion conditions. Fragment A is composed of two distinct domains, each of which is relatively compact and thus resistant to further degradation. The 5 Å model clearly reveals that EF-Tu is comprised of two distinct domains. We believe that the larger structural domain includes Fragment B and the minor domain corresponds to Fragment C. Our proposed assignments are consistent with other biochemical studies. By application of the Chou-Fasman rules, the sequence of Fragment B is predicted to contain long stretches of α helices alternating with β sheet structure (11). Circular dichroism studies on the tryptic fragments also confirm that the α helical structure present in the native enzyme is located predominantly within Fragment B.⁵ In the low resolution model, the larger domain is characterized by approximately seven long rod-like densities, probably representing α helical regions. The circular dichroic spectrum of Fragment C, on the other hand, exhibits no features immediately correlated with known types of secondary structure. Its sequence is predicted to be primarily composed of β structure, which would not be apparent in a low resolution electron density map. The minor domain of our model does not appear to contain α helical regions or other pronounced secondary structural features. Furthermore, Fragment B is known to contain all three sulfhydryl groups of native EF-Tu (25). Only one thiol group, cysteine 100, is reactive toward mercurial reagents in the presence of GDP in solution.
From the structural results, it is clear that the mersalyl derivative has substituted at only one site in the presence of excess GDP and, from the known chemical reactions of mersalyl, the Hg site is very likely to be cysteine 100 in Fragment B. The single Hg site is located on the outer boundary of the large domain as indicated in Fig. 5.
The unambiguous identification of the NH2-terminal fragment, D, is not possible at 5 Å resolution. However, we believe that Fragment D is more likely to be associated with Fragment B in the large domain than with Fragment C. Since Fragment D is known to remain tightly associated with Fragment A in solution and in several crystalline forms, very strong intramolecular interactions probably exist between the two polypeptides. These same strong interactions are likely to be present in the intact native enzyme as well and are not likely to be disrupted by trypsin cleavage at two specific exposed sites. In the intact native enzyme, the sequence corresponding to Fragment D is positioned 14 amino acids from the Fragment B sequence. Because of the sequence proximity and the unlikelihood of a positional shift of Fragment D following trypsin cleavage, we believe that the large domain includes Fragment D as well as the amino acid residues corresponding to Fragment B, while the smaller domain is the Fragment C region. Indeed, the ratio of the estimated volumes of the major to the minor domain is in very good agreement with the molecular weight ratio of Fragments B and D (29,500) to Fragment C (13,000).
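The agreement between the volume ratio and the molecular weight ratio can be illustrated numerically. The sketch below is not the authors' calculation: it assumes, purely for illustration, that each domain volume can be approximated by the rectangular box defined by the quoted domain dimensions (35 × 40 × 45 Å and 25 × 30 × 35 Å), which overestimates both volumes but should roughly preserve their ratio. The function names are hypothetical.

```python
# Hedged sketch: compare the domain volume ratio with the fragment mass ratio.
# Assumption (not from the paper): each domain is approximated by its bounding
# box, using the domain dimensions quoted in the text (in angstrom).

MAJOR_DOMAIN = (35.0, 40.0, 45.0)
MINOR_DOMAIN = (25.0, 30.0, 35.0)

MASS_B_PLUS_D = 23_000 + 6_500   # daltons, assigned to the major domain
MASS_C = 13_000                  # daltons, assigned to the minor domain


def box_volume(dims):
    """Volume of a rectangular box with the given edge lengths."""
    x, y, z = dims
    return x * y * z


def ratio(numerator, denominator):
    """Simple quotient, kept as a function for readability."""
    return numerator / denominator


if __name__ == "__main__":
    volume_ratio = ratio(box_volume(MAJOR_DOMAIN), box_volume(MINOR_DOMAIN))
    mass_ratio = ratio(MASS_B_PLUS_D, MASS_C)
    print(f"volume ratio (major/minor): {volume_ratio:.2f}")
    print(f"mass ratio (B + D)/C:       {mass_ratio:.2f}")
```

Under the bounding-box assumption the two ratios differ by only a few percent, which is in line with the assignment of Fragments B and D to the major domain and Fragment C to the minor domain.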
Comparison of the Tetragonal Structural Results with the Orthorhombic Forms-In addition to the tetragonal form, two other crystalline modifications of cleaved EF-Tu are known and are under analysis by x-ray diffraction techniques (31, 32). The unit cells of all three crystalline forms can be interrelated and it is likely that the three forms are nearly identical in the gross structural features of the enzyme. However, the finer details of the molecular structure are likely to be different in each crystal form and they may represent different conformational states of EF-Tu related to its function in protein biosynthesis. In comparing our model to the published low resolution structures of others, we find ourselves in considerably better agreement with the P2(1)2(1)2(1) model (31) than with the C222(1) model (32).
From the low resolution structure derived from the P2(1)2(1)2(1) crystals, the investigators find that the trypsin-treated enzyme is composed of two domains in which the larger one, "the head," exhibits considerable α helical characteristics at 6 Å resolution. The protein dimensions of 75 × 50 × 35 Å and the overall structural features are in general agreement with the P4(3)2(1)2 model. The relative connection of the two domains also appears to be identical. In fact, the present model can be easily superimposed upon the P2(1)2(1)2(1) model (Fig. 3; Ref. 31) by a 90° rotation of the view in Fig. 5 (top). The primary discrepancy is the large "hole" between the domains of the P2(1)2(1)2(1) model which is not evident in the P4(3)2(1)2 model. It is conceivable that the two clefts between the domains in the present model would appear as a hole if some electron-dense regions were weaker. On the other hand, the hole in the P2(1)2(1)2(1) form could also be a result of artifacts introduced by the molecular averaging calculations which were applied. Although the molecular averaging technique is useful for excluding background noise, it could also eliminate meaningful electron density if the localized conformations were significantly different between the pseudo 2-fold-related molecules. These discrepancies at low resolution are relatively minor and will certainly be clarified by the high resolution structures.