Formaldehyde-induced Cross-Linkages in the α Subunit of the Escherichia coli Tryptophan Synthetase

SUMMARY Two intramolecular cross-linkages (methylene bridges) have been introduced by formaldehyde in the α subunit of the Escherichia coli tryptophan synthetase without any detectable loss in enzyme activity. These have been assigned to side chains in the sequences between residues 152 to 158, 212 to 218, and 227 to 233. Based on the conditions of reaction, the stability of the methylene bridges, and the availability of reactive residues in these sequences, the specific amino acid pairs linked together have been tentatively designated as asparagine-156-serine-214 and glutamine-218-serine-232.

Structure-function relationships utilizing both enzyme components (the α and β2 subunits) of the Escherichia coli tryptophan synthetase have been the subject of investigation in several laboratories (1-6). Attention here has been focused on the α subunit of this enzyme complex.
Chemical modification studies, primarily with sulfhydryl group reagents (1-3), and some limited observations of mutationally altered proteins (4) have suggested discrete conformations in certain regions of the active enzyme.
More recently, a further examination of the three-dimensional structure of this enzyme has been initiated with bifunctional reagents; the results obtained with bis(maleimidomethyl)ether (7) and 1,5-difluoro-2,4-dinitrobenzene, reagents which react with the critical cysteinyl residues, have confirmed several suggestions which arose from the initial chemical studies.
In an effort to determine any functionally significant conformation in areas other than the cysteinyl region, formaldehyde, a bifunctional reagent which is less specific for these residues in this protein under certain conditions, was used.  Edsall in 1945 (8) and subsequent studies by  have extended the investigation of formaldehyde as a cross-linking reagent both in model systems and with proteins. The primary reaction with many proteins appears to be the formation of an aminomethylol derivative with the ε-amino group of lysyl residues.
The cross-linking reaction involves the condensation of this group with an active hydrogen on primary amides, guanidyl, phenolic, and imidazolyl groups to form a methylene bridge.
In addition, these and other side chain groups (e.g. aliphatic alcohols and thioalcohols) are potentially capable of initiating the cross-linking reaction (8). The conditions for methylene bridge formation vary for the different side chains, although most can react at neutral to alkaline pH and at moderate temperatures.
Thus, formaldehyde should be readily applicable for enzyme modification studies, and this report presents the results obtained from an examination of the reaction of this reagent with the α subunit.

MATERIALS AND METHODS
Substrates and Reagents-The preparation of indoleglycerol phosphate and 14C-indoleglycerol phosphate was described previously (1). 14C-Formaldehyde and uniformly labeled 14C-L-lysine were obtained from New England Nuclear. Trypsin-L-1-tosylamido-2-phenylethyl chloromethyl ketone, chymotrypsin, and pepsin were purchased from Worthington.
Enzyme Preparations-The α subunit was purified according to the method of Malkinson and Hardman (3). A crude extract of E. coli strain A2, which contains a defective α subunit, was used as the source of the β2 subunit.
All assays for enzymatic activity have also been described (3).
Analytical Methods-Sulfhydryl group content was determined in 4 to 6 M urea with 5,5'-dithiobis(2-nitrobenzoic acid) according to the method of Malkinson and Hardman (3). Protein was assayed by the method of Lowry et al. (14). Molecular weight estimations were based on the sodium dodecyl sulfate-polyacrylamide disc electrophoresis method of Weber and Osborn (15).
Polyacrylamide disc electrophoresis methods have been described before (3). Gel tracings were made with the linear transport attachment to the Gilford recording spectrophotometer.
No additional radioactivity could be removed after this dialysis. Subsequently, a similar dialysis was performed in the absence of dimedon to remove this reagent.
The extent of reaction for each sample was calculated from radioactivity and protein measurements. These were the conditions used in all of the experiments described.
The only variations were in the volume of the reaction mixture, the length of incubation, and the omission of dithiothreitol from the dialysis buffer when sulfhydryl group content was determined.
Figure legend (fragment): Lower curve (0) represents the level of bound formaldehyde released by acid hydrolysis and distillation; lower curve (X-X) represents the level of 14C-lysine bound (moles per mole of protein) to samples treated with unlabeled formaldehyde.
The volume of the reaction mixture varied, depending on the amount of sample needed for analysis.
For the experiments described in Figs. 1 to 4, 1.0-ml volumes were used; for peptide analyses, 30- to 60-ml (15 to 30 mg of protein) volumes were required.
Peptide Analysis of Formaldehyde-treated Protein-Prior to digestion for peptides, the protein was lyophilized and the cysteinyl residues were carboxymethylated with iodoacetate according to the procedure of Hardman and Yanofsky (1). This procedure alkylates only the cysteinyl residues.
Tryptic digestion was performed as described by Helinski and Yanofsky (16). After removal of urea and salts from the digest by filtration through a Sephadex LH-20 column at 4°, the radioactive peptides were isolated by preparative paper electrophoresis and chromatography (16).
The radioactive bands were detected with a Packard radiochromatogram scanner and eluted with water. All of the peptides which were analyzed were isolated and determined to be pure by these techniques.
Amino acid analyses were performed with the Beckman model 120B amino acid analyzers of the Department of Biology service facility. Chymotryptic and peptic digestion of the isolated tryptic peptide were carried out by the methods of Helinski and Yanofsky (16) and Guest and Yanofsky (17), respectively, and the resulting radioactive peptides isolated as above.
The dimethylaminonaphthalene sulfonyl chloride end group procedure was performed as described by Gray (18) and the dimethylaminonaphthalene sulfonyl-amino acids were identified according to the procedure of Morse and Horecker (19).

RESULTS
Extent and Nature of Reaction of α Subunit with Formaldehyde-Although no extensive study was made regarding the effect of pH on the reaction, preliminary experiments indicated very little reaction (through 48 hours of incubation) at pH 7 in phosphate buffer.
Conditions of lower pH were avoided since the enzyme becomes progressively inactive below pH 6. At pH 8.2 (1% sodium bicarbonate), however, substantial reaction occurs and continues linearly from 4 through 24 hours (Fig. 1). During this time, approximately 4 moles of formaldehyde are bound and the pattern of labeling appears to be very reproducible.
Longer incubations, up to 40 hours, indicate that 15 to 20 moles of formaldehyde could be bound.
In view of the difficulties that were anticipated in determining the sites of reaction at this level of labeling, efforts were directed toward the initial labeling process (4- to 7-hour period), represented by the lag period in Fig. 1. It is seen in Figs. 1 and 2 (upper curves) that even at very short incubation times there was substantial binding. The value of 0.6 to 0.8 mole of formaldehyde bound at "0 hours" incubation represents that reaction which occurs during the time involved in setting up the dialysis and the initial period of dialysis (see "Materials and Methods"). An increase in net binding to about 1.2 to 1.3 moles occurs during the first 4 to 7 hours.
In all of the subsequent experiments, the reaction was terminated after 4½ hours.
In larger scale reaction mixtures, the enzyme preparations routinely contained 1.2 to 1.5 moles of formaldehyde per mole of enzyme.
The finding that the level of bound formaldehyde in all preparations was always larger than unity indicates that some fraction of the protein molecules had more than 1 molecule of formaldehyde.
The properties of formaldehyde-treated protein on analytical polyacrylamide gel disc electrophoresis further indicate a heterogeneous mixture of protein molecules.
The electrophoretic pattern shows that about 80 to 85% of the protein migrates substantially faster than untreated, control enzyme preparations, suggesting that on this basis, at least, nearly all of the protein has reacted with formaldehyde. Unfortunately, attempts to resolve the different fractions by preparative polyacrylamide electrophoresis were unsuccessful, all of the protein eluting as a single, large diffuse band.
For this reason, the characterization (molecular weight and enzymatic and peptide analyses) of the treated protein was performed on total unfractionated preparations.
Despite this drawback, however, these analyses provided relatively clear-cut answers.
The peptide analysis, in particular, provides supporting evidence that a major portion (>50%) of the reacted protein molecules contain 2 molecules of formaldehyde.
Molecular Weight Estimations-Although the protein concentration was kept relatively low in an attempt to minimize any intermolecular cross-linking, it was important to establish this point.
Molecular weight estimation by the sodium dodecyl sulfate-polyacrylamide method (15) was used. The position of each protein in the mixture was established in separate gels. There appears to be essentially no difference in mobility between the treated and untreated protein.
Molecular weight estimations of the formaldehyde-protein based on the mobilities of the control proteins indicate a molecular weight identical with the normal α subunit.
Activity Measurements-Fig. 4 shows the results of activity measurements on the enzyme which was treated with formaldehyde for periods of time up to 43 hours. Fig. 4A shows the activity of α subunit alone in the reaction, indoleglycerol phosphate to indole; Fig. 4B shows the activity of the α subunit in the presence of the β2 subunit (the indole plus serine to tryptophan reaction).
In neither case is there substantial activity loss. Thus, little if any of the formaldehyde, which reacts either monofunctionally or bifunctionally, appears to alter any side chain or conformation which is essential for normal functioning of the enzyme.
Bifunctionality of Reaction-In order to determine whether formaldehyde had formed a cross-linkage or was present as the monomethylol derivative, several techniques were used. The monomethylol derivatives of many amino acids are labile to conditions of acid hydrolysis; the bound formaldehyde is liberated by distillation from acid (8, 9). Accordingly, aliquots of the reaction mixture were removed, dialyzed as described, and hydrolyzed for 24 hours in 6 N HCl at 110° in a vacuum. Subsequent distillation of these samples into dimedon indicated that only about 10 to 20% of the radioactivity was lost from the hydrolysate.
These data are shown in the lower curve ( l ) in Fig. 2 to the formaldehyde-treated enzyme.
After the initial treatment with unlabeled formaldehyde, a second 4½-hour incubation, under similar conditions, was performed in the presence of 14C-lysine. Each sample was then redialyzed as before and analyzed for bound lysine.
The concentration of lysine represented a 1000-fold excess over the amount of bound formaldehyde in these samples. As can be seen in Fig. 2
Sequences of the tryptic peptides (from the legend to Table I):
TP-29: His-Asn-Val-Ala-Pro-Ile-Phe-Ile-Cys-Pro-Pro-Asn-Ala-Asp-Asp-Asp-Leu-Leu-Arg
TP-13: Glu-Tyr-Asn-Ala-Ala-Pro-Pro-Leu-Gln-Gly-Phe-Gly-Ile-Ser-Ala-Pro-Asp-Gln-Val-Lys
TP-6: Ala-Ala-Ile-Asp-Ala-Gly-Ala-Ala-Gly-Ala-Ile-Ser-
However, these data do agree well with the other monofunctional assay, and, furthermore, the peptide analyses presented below strongly suggest that at least 65 to 70% of the formaldehyde is bound as a methylene bridge structure.
Localization of Sites of Formaldehyde Cross-Linkages-The acid stability of the bound formaldehyde suggested cross-linkages other than those involving lysine-arginine or lysine-amide side chains (9, 10). In addition, sulfhydryl group titrations (Fig. 4C) indicated that none of the cysteine residues had reacted. Since tryptic peptides are readily identifiable and since limited digestion by trypsin due to blocked lysine or arginine residues was not expected, tryptic digestion was attempted initially to localize the cross-linkages.
After tryptic digestion, the urea and digestion buffer were removed by Sephadex LH-20 filtration.
Four peaks (A, B, C, and D) of radioactivity were found (Fig. 5). Peaks A and B represent 65 to 70% and 20%, respectively, of the total radioactivity.
Peak A yielded seven to eight ninhydrin-staining bands on preparative paper electrophoresis. Of these, only one was radioactive (mobility, 0.28 to 0.32, relative to the quinine sulfate marker (16); this band contained approximately 90% of the total radioactivity).
This material was eluted from the paper and tested for homogeneity by re-electrophoresis and paper chromatography (16). A single ninhydrin-staining component was found with both techniques (mobility, 0.30 on electrophoresis; RF, 0.15 in the descending chromatography system (16)). Peak B also contained seven to eight ninhydrin-staining bands on preparative paper electrophoresis. Again, only one major radioactive component was found, containing ≥90% of the radioactivity.
The amino acid composition of this band readily established its purity.
The remaining peaks (C and D) contained a total of about six to seven ninhydrin-staining components. In none of these, however, was the level of radioactivity high enough to be clearly distinguished from background. Because of this and the fact that 85 to 90% of the labeled peptides had been accounted for in Peaks A and B, the latter two fractions (C and D) were not studied further.
The amino acid analyses of the radioactive peptides from Peaks A and B are shown in Table I. Tryptic peptide A (from Sephadex Peak A) contained both lysine and arginine in a 2:1 ratio, suggesting that it consisted of three normal tryptic peptides. Calculations were based on these molar ratios. A rigorous comparison with all of the known tryptic peptides in various combinations indicated that it contained TP-29 (residues 145 to 163), TP-13 (residues 201 to 220), and TP-6 (residues 221 to 238). These are the only possible tryptic peptides whose total composition is compatible with that found for this labeled peptide. The compositions of TP-29, TP-13, and TP-6 are shown for comparison in Table I and the sequences of these peptides are given in the legend to the table. Tryptic peptide B (the radioactive band from Sephadex Peak B) was easily recognized as being identical with TP-7 (residues 256 to 262).
No further work was done with this peptide since any cross-linkage which may be present
The fact that the composition of tryptic peptide A included three normal peptides suggests several possibilities for its structure.
It could represent, in addition to a single labeled peptide, the other peptides as contaminants.
This possibility seems unlikely since the constituent peptides (TP-29, TP-13, and TP-6) are readily separable by either the electrophoresis or chromatography systems.
The RF values for TP-29, TP-13, and TP-6 are 0.27, 0.10, and 0.00, respectively. These values were obtained in separate experiments and agree well with those obtained from published peptide maps (20). Furthermore, if any of these peptides were not covalently linked together, the molar ratios of the residues found in tryptic peptide A (and in all of the chymotryptic and peptic peptides derived from it) would require that the nonlinked peptide had eluted in a molar ratio which is nearly identical with that of the linked peptide.
From these considerations, it would appear that all three peptides are linked together.
This possibility raises the question of whether one or more cross-linkages are present in this peptide.
Peptides TP-13 and TP-6 are contiguous in the amino acid sequence (21) and their presence together may simply represent the fact that the lysine-alanine bond (residues 220 and 221) between these two peptides was poorly cleaved by trypsin. This would mean that either TP-13 or TP-6 would be cross-linked to TP-29 by a single methylene bridge. End group analysis by the dimethylaminonaphthalene sulfonyl chloride technique (18) indicated the presence of three amino-terminal residues, histidine, glutamic acid, and alanine. This finding helps substantiate the identification of the peptides and further indicates that the lysine-alanine bond between TP-13 and TP-6 was cleaved by trypsin. In addition, the specific radioactivity was determined for the preparation of tryptic peptide A whose analysis is presented in Table I. The moles of 14C-formaldehyde per mole of lysine, arginine, and histidine were 1.06, 2.30, and 1.99, respectively. Since the peptide contains 2 lysyl residues, this gives an average value of 2.14 moles of 14C-formaldehyde per mole of tryptic peptide A. Several other preparations of this peptide, in which the specific radioactivity was measured, contained 1.76 to 1.94 moles of 14C-formaldehyde per mole of peptide. These observations strongly suggest that tryptic peptide A consists of three distinct and separate peptides, which are linked through two methylene bridges. Subsequent digestion of tryptic peptide A with chymotrypsin and pepsin was performed in order to verify this point and to narrow further the areas involved in the cross-linkages. The amino acid analyses of the peptic peptides obtained from tryptic peptide A are shown in Table III, together with the composition of those pertinent areas of tryptic peptide A indicated in the footnotes to the table.
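The averaging of the specific radioactivity values can be reproduced with a short calculation. The sketch below assumes that tryptic peptide A contains 2 lysyl, 1 arginyl, and 1 histidyl residue per mole (consistent with the 2:1 lysine to arginine ratio and the single amino-terminal histidine reported in the text, but not taken directly from Table I); the function and variable names are illustrative only.

```python
# Residues per mole of tryptic peptide A.
# ASSUMPTION: 2 Lys, 1 Arg, 1 His, inferred from the text rather than Table I.
RESIDUES_PER_PEPTIDE = {"Lys": 2, "Arg": 1, "His": 1}

# Reported moles of 14C-formaldehyde per mole of each residue.
HCHO_PER_RESIDUE = {"Lys": 1.06, "Arg": 2.30, "His": 1.99}


def hcho_per_peptide(per_residue, residues_per_peptide):
    """Convert per-residue ratios to per-peptide ratios and average them."""
    per_peptide = {
        name: ratio * residues_per_peptide[name]
        for name, ratio in per_residue.items()
    }
    mean = sum(per_peptide.values()) / len(per_peptide)
    return per_peptide, mean


per_peptide, mean = hcho_per_peptide(HCHO_PER_RESIDUE, RESIDUES_PER_PEPTIDE)
print(per_peptide)     # per-peptide estimate from each reference residue
print(round(mean, 2))  # average moles of formaldehyde per mole of peptide
```

Under these assumptions the three estimates are 2.12, 2.30, and 1.99, and their mean rounds to the 2.14 moles quoted in the text.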
The relatively large size of these fragments and their poor yield indicates minimal digestion by pepsin. Attempts to reduce the size of the fragments in the area of the cross-linkages by longer incubation with pepsin have been unsuccessful; no new radioactive peptides were produced.
Chymotryptic digestion produces four to five peptides, only one of which was radioactive; pepsin produces six to seven peptides, of which three were radioactive-two were in equivalent although poor yield, and the third in about one-half the yield of the other two.
The purity of these peptides was determined as described above for tryptic peptide A. The amino acid analyses of the chymotryptic fragment obtained from two preparations of tryptic peptide A are given in Table II and are consistent with the composition presented for those areas of tryptic peptide A indicated in Footnote a of Table II. Residues 152 to 162 and 212 to 238 represent, respectively, chymotryptic peptides CP-49 (22) and A 169 CP-M (minus that portion which had been cleaved by trypsin at lysyl residue 238) (21). Specific radioactivity measurements on one of these chymotryptic preparations showed
It is apparent, however, from the analyses of these fragments and the previous data on tryptic peptide A, that there is more than one cross-linkage between the three tryptic peptides. A summary of all of the peptide sequences is shown in Fig. 6. The presence of TP-29 (residues 145 to 163), which is widely separated in the amino acid sequence from TP-13 and TP-6 (residues 201 to 238), clearly implicates this peptide on one side of the cross-linkage. The absence of residues 219 to 220 in peptic peptide I, of residues 219 to 223 in peptic peptide II, and of residues 221 to 226 in peptic peptide III indicates that two additional and separate sequences are involved.
End group analyses of the tryptic peptide also suggested that three separate peptides were involved.
Fig. 6. Summary of the labeled tryptic, chymotryptic, and peptic peptides found in Sephadex Fraction A. The chymotryptic fragment is shown in italics; the peptic peptides I, II, and III are indicated by solid, dashed, and dotted lines, respectively. The regions common to all of the peptides are shown enclosed.
and B. It was observed that about 80% of the protein had reacted.
If, in these molecules, 70% of the formaldehyde was found in doubly labeled protein (tryptic peptide A), and the remainder in singly labeled protein (tryptic peptide B), a net of about 50% of all of the protein was doubly labeled and 30% was singly labeled.
This results in a net of 1.4 moles of formaldehyde per mole of enzyme.
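The bookkeeping behind this estimate is a weighted sum over the labeled populations. The following is a minimal sketch using the rounded fractions quoted in the text; treating the remaining 20% of the protein as carrying no formaldehyde is an assumption made here, and the function name is illustrative.

```python
def mean_label_per_protein(fraction_by_label_count):
    """Average moles of formaldehyde per mole of protein.

    fraction_by_label_count maps the number of bound formaldehyde molecules
    to the fraction of protein molecules carrying that many.
    """
    total = sum(fraction_by_label_count.values())
    if total > 1.0 + 1e-9:
        raise ValueError("fractions must not sum to more than 1")
    return sum(n * f for n, f in fraction_by_label_count.items())


# Rounded fractions from the text: 50% doubly labeled, 30% singly labeled.
# ASSUMPTION: the remaining 20% is unreacted (0 formaldehyde bound).
fractions = {2: 0.50, 1: 0.30, 0: 0.20}
print(round(mean_label_per_protein(fractions), 2))
```

With these rounded fractions the weighted sum is 1.3, close to the 1.4 stated in the text and within the 1.2 to 1.5 moles measured for the larger scale preparations; the small difference presumably reflects rounding of the fractions, which is an assumption rather than something the source states.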

DISCUSSION
The results presented here indicate that formaldehyde has introduced two major intramolecular cross-linkages in the α subunit.
These cross-linkages can be assigned to residues linking tryptic peptides TP-29, TP-13, and TP-6. The compositions of the chymotryptic and peptic peptides are consistent with this conclusion; all can be considered as a family of peptides obtained from these three tryptic peptides.
On the basis of the composition of all of these peptides, the sequences involved in the cross-linkages can be narrowed further. A tentative assignment of these regions is to the sequences between residues 152 to 158, 212 to 218, and 227 to 233. The assignment of the amino-terminal residues for these sequences is based on the composition of the chymotryptic fragment for residue 152, the chymotryptic peptide and peptic peptide III for residue 212, and peptic peptide III for residue 227. The assignment of the carboxy-terminal residues is based on the composition of peptic peptide II for residue 158, peptic peptides I and II for residue 218, and all three peptic peptides for residue 233. The areas common to the sequences of all of the peptides are shown in Fig. 6.
It is possible that the sequence of residues 152 to 156 can also be eliminated because of the composition of peptic peptide III. However, this assumes peptic cleavage on the carboxy side of a prolyl residue. This is unusual, although the same assumption can be made for peptic peptide II; both require a splitting after a prolyl-proline sequence.
For this reason, these amino-terminal portions of these peptic peptides cannot be considered as conclusive.
The absolute yields of all of the peptic peptides were very low and precluded any reliable amino or carboxy-terminal analyses.
Of some importance, also, with regard to these assignments is the designation of residue 233 as a carboxy-terminal residue.
The inclusion of an additional seryl residue (residue 234) in these sequences would compromise the conclusions discussed later.
This possibility does not seem too likely in view of the fact that serine, which is reportedly susceptible to some loss during acid hydrolysis, has always been recovered in 85 to 90% yield here.
Furthermore, the analyses of all three peptic peptides are consistent with the exclusion of an additional seryl residue.
The designation of the specific residues linked by the methylene bridges cannot be made unequivocally at the present time since the isolation of only those residues involved was not possible. On the basis of a number of observations, however, nearly all of the possible cross-linkages can be eliminated. The exclusion of lysyl, arginyl, tyrosyl, histidyl, and cysteinyl side chains is certain, primarily on the basis of the peptide analyses and the sulfhydryl group titrations. Indolyl groups need not be considered since the α subunit contains no tryptophan.
Of the residues contained within the sequences mentioned above, serine, aspartic acid, glutamine, and asparagine are possibilities at least (8). Several of these would appear unlikely, however, when considering (a) the reaction conditions used (pH 8.3, aqueous medium), (b) the stability of the bound formaldehyde to conditions of acid hydrolysis and distillation, and (c) the fact that two cross-linkages must be present in the available sequences.
The stability of formaldehyde bound in this type of methylene bridge is not clearly defined, although both threonine and serine have been shown to combine formaldehyde with phenolic and imidazole compounds in an acid-stable manner (13). If such cross-linkages are present in the peptides isolated, they would probably be between asparagine-156 and serine-214 and between glutamine-218 and serine-232. These assignments are, of course, based on the exclusion of serine-234.
If this residue were present, a number of other amide-seryl cross-linkages would be possible.
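The candidate residues within the three assigned sequences can be enumerated directly from the tryptic peptide sequences. The sketch below transcribes the sequences from the legend to Table I, reading the residue at position 156 as Asn and that at position 218 as Gln, as the assignments in the text require. The TP-6 sequence is available here only through residue 232, so residue 233 is left unknown; the helper names are illustrative.

```python
# Tryptic peptide sequences (first residue number, sequence).
# ASSUMPTION: TP-6 is truncated after residue 232 in the available text,
# so positions 233 to 238 are not represented.
TRYPTIC_PEPTIDES = {
    "TP-29": (145, "His-Asn-Val-Ala-Pro-Ile-Phe-Ile-Cys-Pro-Pro-Asn-Ala-"
                   "Asp-Asp-Asp-Leu-Leu-Arg"),
    "TP-13": (201, "Glu-Tyr-Asn-Ala-Ala-Pro-Pro-Leu-Gln-Gly-Phe-Gly-Ile-"
                   "Ser-Ala-Pro-Asp-Gln-Val-Lys"),
    "TP-6": (221, "Ala-Ala-Ile-Asp-Ala-Gly-Ala-Ala-Gly-Ala-Ile-Ser"),
}

# Sequences assigned to the cross-linkages, and the side chains the text
# lists as possibilities.
WINDOWS = [(152, 158), (212, 218), (227, 233)]
CANDIDATES = {"Ser", "Asp", "Gln", "Asn"}


def residue_map(peptides):
    """Return {residue_number: three_letter_code} for all known positions."""
    numbered = {}
    for first, sequence in peptides.values():
        for offset, residue in enumerate(sequence.split("-")):
            numbered[first + offset] = residue
    return numbered


def candidates_in_window(numbered, start, end, candidates):
    """List candidate residues as 'Name-number' within [start, end]."""
    return [
        f"{numbered[pos]}-{pos}"
        for pos in range(start, end + 1)
        if numbered.get(pos) in candidates
    ]


numbered = residue_map(TRYPTIC_PEPTIDES)
for start, end in WINDOWS:
    print(start, end, candidates_in_window(numbered, start, end, CANDIDATES))
```

With the sequences as transcribed, the candidates are Asn-156 and Asp-158 in the first sequence, Ser-214, Asp-217, and Gln-218 in the second, and Ser-232 in the third, which is consistent with the pairs proposed above once aspartic acid is set aside.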
Regardless of the absolute accuracy of the assignment of the specific cross-linkages, however, it is clear that these regions of the protein are bound together by formaldehyde and are probably closely aligned in the three-dimensional structure of the active enzyme.
In addition, activity measurements suggest that these cross-linkages do not distort the conformation or hinder access of the substrate, indoleglycerol phosphate, to the α subunit when it is acting alone.
Neither is there any indication of an impairment in the ability of the treated α subunit to combine functionally with the β2 subunit.
Both findings (i.e. the sites of the cross-linkages and the recovery of activity) are consistent with structure-function studies using other bifunctional reagents (7) which indicate that this area of the protein is substantially removed from the site of substrate interaction.