The amino acid sequence of Escherichia coli cyanase.

The amino acid sequence of the enzyme cyanase (cyanate hydrolase) from Escherichia coli has been determined by automatic Edman degradation of the intact protein and of its component peptides. The primary peptides used in the sequencing were produced by cyanogen bromide cleavage at the methionine residues, yielding 4 peptides plus free homoserine from the NH2-terminal methionine, and by trypsin cleavage at the 7 arginine residues after acetylation of the lysines. Secondary peptides required for overlaps and COOH-terminal sequences were produced by chymotrypsin or clostripain cleavage of some of the larger peptides. The complete sequence of the cyanase subunit consists of 156 amino acid residues (Mr 16,350). Based on the observation that the cysteine-containing peptide is obtained as a disulfide-linked dimer, it is proposed that the covalent structure of cyanase is made up of two subunits linked by a disulfide bond between the single cystine residue in each subunit. The native enzyme (Mr 150,000) then appears to be a complex of four or five such subunit dimers.

The enzyme cyanase (cyanate hydrolase, EC 3.5.5.3) has been reported to be present in animal tissues (1), plants (2, 3), and bacteria (4, 5). A recent investigation of this enzyme could not confirm its presence in animal tissues (6), but, as reported (5), the enzyme was found in Escherichia coli, and after induction with cyanate it could be isolated and characterized (7 [...] Tables I-XIII, and additional references) are presented in miniprint at the end of the paper. The

RESULTS AND DISCUSSION
Summary and Evaluation of the Sequence Data-The complete structure of E. coli cyanase is given in Fig. 1. An examination of the fragments used in deducing the sequence should establish that the reported sequence is documented by peptide overlaps and that it is therefore unlikely that any segment has been missed. The weakest point in this regard is the sequence Ala-58-Arg-59-Leu-60 (the COOH terminus of a chymotryptic peptide), in which the single Leu residue establishes the connection between the two Arg peptides R3 (12-59) and R4 (60-81). The sequence of CNBr peptide M4 (95-100) was determined, but with ambiguous results because of the combination of Glu and homoserine at the COOH-terminal end; these two amino acids were not resolved by amino acid analysis, and the carboxypeptidase treatment consequently gave equivocal results. The ideal way to solve this problem would be to sequence peptide R7 (97-141), but we were unable to achieve a satisfactory separation of the two large Arg peptides R3 and R7. Since R3 had been obtained pure from the large CNBr peptide M2, however, and its sequence had been established, R3 and R7 were sequenced as a mixture, and the initial 8 residues of R7 were thus deduced to establish the important overlap of M4, R7, and M5. Another part of the sequence, the carboxyl terminus, is based on minimal direct sequence data but is supported by carboxypeptidase A release of a single Phe from both the intact enzyme and the COOH-terminal peptides M5 and R8; the amino acid analyses of the two COOH-terminal peptides exclude the presence of an additional short peptide terminating in a second Phe residue. Some useful peptides were obtained by serendipity from unexpected enzyme cleavages. Thus, our preparation of clostripain yielded several unpredicted peptides, including one from M2 which could only result from a cleavage between Ala-46 and Leu-47 and which, along with the chymotryptic peptide 49-59, established the COOH-terminal segment of R3.
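The overlap argument around peptides M4 and R7 can be checked with a small in-silico digest. The sketch below is illustrative only and is not part of the original analysis. It assumes complete, idealized cleavage on the COOH side of every Met (cyanogen bromide) or every Arg (trypsin after acetylation of the lysines), and it does not model the conversion of Met to homoserine. The function name `cleave_after` is hypothetical. The input is the stretch Arg-81 to Met-100 from Fig. 1 in one-letter code, with `B` (Asx) standing in for the two residues whose Asp/Asn identity is not legible in this copy.

```python
def cleave_after(sequence, residues, first_position=1):
    """Split `sequence` after every residue found in `residues`.

    Returns a list of (start, end, peptide) tuples, numbered so that the
    first residue of `sequence` has position `first_position`.
    """
    fragments = []
    start = 0
    for i, aa in enumerate(sequence):
        if aa in residues:
            fragments.append(
                (first_position + start, first_position + i, sequence[start:i + 1])
            )
            start = i + 1
    # Keep whatever follows the last cleavage site, if anything.
    if start < len(sequence):
        fragments.append(
            (first_position + start,
             first_position + len(sequence) - 1,
             sequence[start:])
        )
    return fragments


# Residues 81-100; B = Asx at the two positions that are not legible (assumption).
SEGMENT_81_100 = "RGCIDBRIPTBPTMYRFYEM"

cnbr = cleave_after(SEGMENT_81_100, "M", first_position=81)     # cleavage after Met
tryptic = cleave_after(SEGMENT_81_100, "R", first_position=81)  # cleavage after Arg

for label, fragments in (("CNBr", cnbr), ("Arg", tryptic)):
    for start, end, peptide in fragments:
        print(f"{label:4s} {start:3d}-{end:3d} {peptide}")
```

Within this window the Met cleavage returns the fragment 95-100, matching M4, and the Arg cleavage leaves a fragment starting at Phe-97, the first residue of R7. Fragments that touch either end of the window are truncated by the window and do not represent complete peptides.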
[Fig. 1. The amino acid sequence of E. coli cyanase. Surviving rows: residues 41-60, Ala-Phe-Val-Thr-Ala-Ala-Leu-Leu-Gly-Gln-Gln-Ala-Leu-Pro-Ala-Asp-Ala-Ala-Arg-Leu; residues 81-100, Arg-Gly-Cys-Ile-Asp-Asx-Arg-Ile-Pro-Thr-Asx-Pro-Thr-Met-Tyr-Arg-Phe-Tyr-Glu-Met, where Asx marks a residue whose Asp/Asn identity is illegible in this copy.]

As in previous work (8, 9), we used strong acid for many of the peptide fractionations, and the identification of Asn and Gln could consequently be ambiguous if the exposure to acid was sufficient to hydrolyze the amides. A similar problem arises from the low Tyr yields we found for CNBr peptides, both on amino acid analysis and on sequencing. Because of these concerns, we collected a good deal of redundant sequence data to eliminate these possible ambiguities. As a rule, with the acid-treated peptides we assigned any residue which had a readily identifiable amide component (10-[...] Asx or Glx) to Asn or Gln, and in many cases these assignments could be confirmed by a high yield of amides in peptides that had been treated more gently. In our experience, one of the greatest sources of variation in automatic sequencing with the model 890C sequencer operated without a condenser (cold trap) for the vacuum system is the rapidly declining quality of the pumps. Even with frequent exchange for rebuilt pumps, both the degree of evacuation and the rate at which it is achieved vary over relatively short periods of time, and the repetitive yields were consequently also subject to substantial variation (87-97%). The sequencer performance was monitored frequently, and most of the reported data were obtained with repetitive yields from 93 to 97%.
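To see why the spread in repetitive yield matters, the following sketch compares the fraction of signal remaining after a run of Edman cycles at the yields quoted above. It assumes the usual simple model in which the recoverable signal after n cycles scales as the repetitive yield raised to the power n; that model and the choice of 20 cycles are illustrative assumptions, not figures from this work, and `cumulative_yield` is a hypothetical helper name.

```python
def cumulative_yield(repetitive_yield, cycles):
    """Fraction of the initial signal left after `cycles` Edman cycles,
    assuming a constant repetitive yield per cycle (simplifying assumption)."""
    return repetitive_yield ** cycles


CYCLES = 20  # arbitrary illustrative run length, not taken from the paper

for y in (0.87, 0.93, 0.97):
    remaining = cumulative_yield(y, CYCLES)
    print(f"repetitive yield {y:.0%}: {remaining:.1%} of signal after {CYCLES} cycles")
```

Under these assumptions, roughly 6% of the signal remains after 20 cycles at 87%, about 23% at 93%, and about 54% at 97%, which is consistent with the emphasis placed on data obtained in the 93 to 97% range.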
Implications of the Sequence on Cyanase Structure-When the cyanase sequence was assessed according to Levitt's (10) statistical analysis of conformational preferences of amino acids in globular proteins, no sequence of more than 4 amino acids favoring (or neutral to) α-helix, sheet, or reverse-turn structures was found.
One aspect of the data warrants detailed consideration here, namely the role of the single Cys residue in cyanase. In the original report on E. coli cyanase (7), the absence of any free sulfhydryl groups was noted. This feature was established by titration with 5,5'-dithiobis-(2-nitrobenzoate) before and after treatment with sodium dodecyl sulfate, or in the presence of 6 M guanidine hydrochloride or 8 M urea. Since each subunit contains only a single Cys residue, the most plausible explanation for the absence of free sulfhydryl groups is that pairs of subunits are covalently linked through disulfide bonds, and that the covalent structure of cyanase thus is a dimer. Consistent with this picture is the finding (Fig. 2 in Miniprint) that the CNBr treatment of cyanase yielded a substantial amount (50-70%) of the peptide M3 (78-94, containing the single Cys-83 residue) as a dimer which could be converted to the monomer by performic acid oxidation. We assume that under the acid conditions of the CNBr treatment and the subsequent peptide separation, no disulfide formation would take place, and that the dimeric peptide thus represents a structure preformed in the intact, active enzyme. With this model the 150,000-dalton enzyme should have the general structure (P-S-S-P)n, where P represents the cyanase subunit of Mr = 16,350, and n has a value of either 4 or 5.
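The estimate of n follows from simple arithmetic on the two molecular weights quoted in the text. The short sketch below reproduces it. The variable and function names are illustrative, and the mass of the two hydrogens lost on disulfide formation is neglected as insignificant at this precision.

```python
NATIVE_MR = 150_000   # native enzyme, from the text
SUBUNIT_MR = 16_350   # cyanase subunit P, from the text

# One disulfide-linked dimer, P-S-S-P (loss of 2 H on oxidation is ignored).
dimer_mr = 2 * SUBUNIT_MR

# Number of dimers implied by the native molecular weight.
n_estimate = NATIVE_MR / dimer_mr


def complex_mr(n):
    """Molecular weight of a complex of n disulfide-linked dimers."""
    return n * dimer_mr


print(f"dimer Mr = {dimer_mr}, n = {n_estimate:.2f}")
for n in (4, 5):
    print(f"n = {n}: Mr = {complex_mr(n)}")
```

The ratio is about 4.6, and the two nearest integers bracket the measured value: n = 4 gives Mr 130,800 and n = 5 gives Mr 163,500. This is why the data do not distinguish between four and five dimers.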