Partial NH2- and COOH-terminal Sequence and Cyanogen Bromide Peptide Analysis of Escherichia coli sn-Glycerol-3-phosphate Acyltransferase*

The sn-glycerol-3-phosphate acyltransferase from Escherichia coli, an integral membrane protein whose activity is dependent on phospholipids, was purified to near homogeneity (Green, P. R., Merrill, A. H., Jr., and Bell, R. M. (1981) J. Biol. Chem. 256, 11151-11159). Determination of a partial NH2-terminal sequence and the COOH terminus permitted alignment of the polypeptide on the sequenced sn-glycerol-3-phosphate acyltransferase structural gene (Lightner, V. A., Bell, R. M., and Modrich, P. (1983) J. Biol. Chem. 258, 10856-10861). Processing of the sn-glycerol-3-phosphate acyltransferase is apparently limited to the removal of the NH2-terminal formylmethionine. Thirteen of 27 possible cyanogen bromide peptides predicted from the DNA sequence were purified, characterized, and assigned to their location in the primary structure. Three peptides located at positions throughout the sequence were partially sequenced by automated Edman degradation. The partial sequence analysis of the homogeneous sn-glycerol-3-phosphate acyltransferase is fully in accord with the primary structure inferred from the DNA sequence.

In the accompanying paper, the nucleotide sequence of a 3865-base pair fragment of DNA containing the structural gene is reported (15). One open reading frame sufficiently long to code for the glycerol-P acyltransferase has been identified. The amino acid composition predicted by the sequence closely resembled that determined for the highly purified glycerol-P acyltransferase (5).
Partial amino acid sequence data and the isolation and characterization of 13 of 27 predicted cyanogen bromide peptides are reported here. These data permit unambiguous alignment of the polypeptide within the structural gene and establish the primary structure of the glycerol-P acyltransferase.

RESULTS
NH2- and COOH-terminal Sequence Determinations-To align the glycerol-P acyltransferase polypeptide within its sequenced structural gene (15), the NH2-terminal sequence was analyzed by automated Edman degradation. Analysis of 25 nmol of glycerol-P acyltransferase established the sequence X-Gly-Trp-Pro-X-Ile-Tyr-Tyr-Lys-Leu (Table I). An increasing background limited quantitative Edman degradation to 10 cycles. The sequence for a small amount of the protein began with methionine. The stagger detected indicates that while most of the protein is processed by cleavage of the NH2-terminal methionine, at least a portion (10-20%) of the molecules arises by removal of the formyl group only. Serine and arginine were not detected. A repetitive yield of 96% was calculated from the recovery of isoleucine at cycle 6 and leucine at cycle 10 of the major sequence. If the recovery of isoleucine is corrected for the 96% repetitive yield in cycles 6 and 7 of the major and minor sequences, respectively, an 80% first cycle yield is obtained.
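The yield arithmetic described above can be illustrated with a minimal Python sketch. It assumes the common convention that the repetitive yield is the geometric per-cycle average between two cycles and that the first cycle yield is obtained by back-extrapolating a recovery to cycle 1. The function names and all numerical values are hypothetical and are not the data of Table I.

```python
def repetitive_yield(amount_early, cycle_early, amount_late, cycle_late):
    """Average per-cycle yield between two Edman cycles (geometric mean)."""
    if cycle_late <= cycle_early:
        raise ValueError("cycle_late must be greater than cycle_early")
    return (amount_late / amount_early) ** (1.0 / (cycle_late - cycle_early))


def amount_at_cycle_one(amount, cycle, rep_yield):
    """Back-extrapolate a PTH-amino acid recovery at a given cycle to cycle 1."""
    return amount / rep_yield ** (cycle - 1)


# Hypothetical recoveries (nmol), chosen only to illustrate the arithmetic.
ile_cycle6 = 10.0                    # residue recovered at cycle 6
leu_cycle10 = ile_cycle6 * 0.96**4   # residue recovered at cycle 10

ry = repetitive_yield(ile_cycle6, 6, leu_cycle10, 10)

# First cycle yield = back-extrapolated amount / amount of protein applied.
applied_nmol = 25.0                  # hypothetical amount applied
first_cycle_yield = amount_at_cycle_one(ile_cycle6, 6, ry) / applied_nmol
```

When a staggered minor sequence is present, as described above, the back-extrapolated recoveries of the same residue in the major and minor sequences would be summed before dividing by the amount applied.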
The COOH-terminal residue was investigated by hydrazinolysis of glycerol-P acyltransferase. Only 0.4 mol of glycine/mol of protein was released. No amino acids were detected in the blank. Egg white lysozyme was used as a positive control, and as expected, leucine was released and recovered at 0.17 mol/mol.

Table II. Analysis of cyanogen bromide peptides: observed and predicted amino acid compositions and NH2 termini
Peptides were hydrolyzed in vacuo at 105 °C for 24 h. A crystal of phenol was added to protect tyrosines. Values represent number of amino acid residues found with the number of amino acid residues predicted in parentheses. NH2-terminal residues were determined by the double coupling Edman degradation method of Chan
All NH2 termini found match predicted termini. The second amino acid for peptide 606-615 matches the predicted second one.
Total nanomoles of peptide recovered as measured by amino acid analysis/330 nmol of cleaved protein.

Peptide Analysis-To establish the primary structure of the glycerol-P acyltransferase, analysis of the peptides produced by treatment with cyanogen bromide was undertaken. The peptides were isolated as described under "Materials and Methods" in yields of 2-30%. Amino acid and NH2-terminal analyses were used to identify the isolated peptides (Table II). Except for glycine, which was consistently found in slightly larger amounts, the values for the amino acids agreed with those inferred from the DNA sequence. The expected NH2-terminal threonine for peptide 21 was not detected, but as expected, tyrosine was detected upon a second cleavage cycle. Three peptides from different parts of the protein were subjected to quantitative automated Edman degradation. Partial sequences ranging from 12 to 21 residues in length were obtained with average repetitive yields of 81-94% (Table III).
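Predicting cyanogen bromide peptides from a deduced sequence, as was done for the comparison described above, can be sketched in Python as follows. The sketch assumes ideal, complete cleavage on the COOH-terminal side of every methionine and ignores incomplete cleavage and side reactions. The function name and the toy sequence are hypothetical and are not taken from the plsB sequence.

```python
from collections import Counter


def cnbr_fragments(sequence):
    """Split a one-letter protein sequence after every methionine (M).

    Returns a list of (start, end, peptide) tuples with 1-based,
    inclusive residue positions.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue == "M":
            fragments.append((start + 1, i + 1, sequence[start:i + 1]))
            start = i + 1
    if start < len(sequence):
        # COOH-terminal peptide, which does not end in methionine
        fragments.append((start + 1, len(sequence), sequence[start:]))
    return fragments


# Toy sequence for illustration only (not the acyltransferase sequence).
toy = "SGWPMAKLMGEG"
peptides = cnbr_fragments(toy)

# Predicted amino acid composition of each peptide, for comparison with
# an observed composition from amino acid analysis.
compositions = {(s, e): Counter(p) for s, e, p in peptides}
```

A sequence with k internal methionines yields k + 1 predicted peptides under these assumptions. In practice the methionine at the COOH terminus of each internal peptide is converted to homoserine or its lactone, so it is not recovered as methionine on amino acid analysis.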

DISCUSSION
The data presented permit unambiguous alignment of the glycerol-P acyltransferase on its structural gene, plsB, and establish the primary structure as that inferred from the DNA sequence. This work comprises the first sequence analysis of any phospholipid biosynthetic enzyme. The base sequence suggested two possible start codons in the same reading frame 60 base pairs apart (15). Glycerol-P acyltransferase purified from membranes of Escherichia coli was subjected to automated Edman degradation (Table I). The 80% yield may indicate that up to 20% of the protein is not accessible to Edman degradation, and thus may be blocked. No evidence of PTH derivatives corresponding to the sequence starting from the first start codon was found, but protein with a sequence beginning at the second start codon was detected. While these data do not completely rule out the possibility that translation occurs from the first start codon and that the 21 NH2-terminal amino acids are post-translationally removed, the data strongly suggest that translation occurs from the second start codon.
Hydrazinolysis was used to demonstrate glycine as the COOH-terminal residue. The predicted COOH-terminal sequence is -Thr-Gln-Gly-Glu-Gly. Although all the available evidence supports the inferred sequence and, to our knowledge, COOH-terminal processing has not been reported, it is possible that either processing or sequence assignment error could yield some other glycine as the COOH terminus. To rule out this possibility, we performed digestions with carboxypeptidases A and B on solubilized glycerol-P acyltransferase in 0.25 M potassium phosphate, pH 7.0, 0.5% Triton X-100, and 5 mM β-mercaptoethanol, on denatured glycerol-P acyltransferase in 0.1% sodium dodecyl sulfate, and on glycerol-P acyltransferase bound to octyl-Sepharose CL-4B (Pharmacia) according to previously described methods (25). No amino acids were released, suggesting inaccessibility of the COOH terminus. However, the deduced sequence is further supported by the fact that 13 of 27 cyanogen bromide peptides predicted from the DNA sequence were isolated and had the correct chemical composition (Table II). The identification of three of the peptides was confirmed by partial NH2-terminal sequence analysis (Table III). These peptides would be derived from distinct sequences distributed throughout the protein, accounting for 80% of the structural gene (Fig. 3). These results are in full agreement with those inferred from the DNA sequence.

Supplementary Material to: Partial NH2 and
The eluants were lyophilized.