Primary Structure of Two Linker Chains of the Extracellular Hemoglobin from the Polychaete Tylorrhynchus heterochaetus*

Two types of linker subunits (linkers 1 and 2) of the extracellular hemoglobin of Tylorrhynchus heterochaetus have been isolated as disulfide-linked homodimers by C18 reverse-phase chromatography. These subunits constituted 6 and 13%, respectively, of total protein area on the chromatogram. The complete amino acid sequences of linkers 1 and 2 were determined by automated Edman sequencing of the peptides derived by digestions with lysyl endopeptidase, trypsin, chymotrypsin, Staphylococcus aureus V8 protease, pepsin, and endoproteinase Asp-N. The linker 1 consisted of 253 amino acid residues (the calculated molecular mass, 28,200 Da), while the linker 2 consisted of 236 residues (26,316 Da). The two chains showed 27% sequence identity. The amino acid sequences of Tylorrhynchus linkers 1 and 2 also showed 23-27% homology with the recently determined sequence of a linker chain of Lamellibrachia hemoglobin (Suzuki, T., Takagi, T., and Ohta, S. (1990) J. Biol. Chem. 265, 1551-1555). In the three linker chains, half-cystine residues were highly conserved; 8 out of 13 residues are identical, suggesting that such residues would contribute to the formation of intrachain disulfide bonds essential for the protein folding of the linker polypeptides. Based on the exact molecular masses of the linker and the heme-containing subunits, the molar ratios estimated for the subunits and the minimum molecular weights per 1 mol of heme, a model is proposed for the subunit structure of the Tylorrhynchus hemoglobin, consisting of 216 polypeptide chains, 192 heme-containing chains, and 24 linker chains.

The sequence suggested that the linker resulted from gene duplication of a heme-containing chain with a three exon-two intron structure, and that the first exon of domain 1 and the last exon of domain 2 had been lost during evolution (7). In the present study, we succeeded in isolating two linker subunits of Tylorrhynchus hemoglobin, all four heme-containing chains of which have already been sequenced (8,9), and determined the complete amino acid sequences of the two linker chains. The sequences were shown to have significant homology with that of the Lamellibrachia linker, especially with respect to the positions of half-cystine residues.

MATERIALS AND METHODS
About 300 g of worms that had been stored at -40 or -80 °C were homogenized with 900 ml of cold 50 mM phosphate buffer (pH 7. (lane 1): subunits L1 and L2 (55 kDa), subunit T (41 kDa), and subunit M (14 kDa). It is already known that subunit T is a disulfide-bonded trimer of three heme-containing chains and that subunit M is a monomeric chain (10). Upon reduction (lane 5), subunits L1 and L2 dissociated further into 34-36-kDa polypeptide chains, which are about double the size of usual heme-containing chains. Thus, subunits L1 and L2 appeared to be disulfide-bonded dimers. The molecular masses of the constituent polypeptide chains of subunits L1 and L2 were estimated to be about 28 kDa, from the mobilities of the unreduced subunits.
The subunits of Tylorrhynchus hemoglobin were separated on a reverse-phase column, as shown in Fig. 2. Four major fractions were eluted, but the third fraction, which absorbed at 540 nm (data not shown), was identified as heme. The recovery of the protein was more than 80%. The SDS-PAGE patterns of fractions 1, 2, and 4 are shown in lanes 2-4 (unreduced) and lanes 6-8 (reduced) of Fig. 1. Fractions 1, 2, and 4 corresponded to subunits L1, L2, and T + M, respectively.
The heme and amino acid analyses showed that Tylorrhynchus hemoglobin contains 1 g eq of heme per 20,680 ± 1,460 g of protein (n = 5).
The amino acid sequence of Tylorrhynchus linker subunit 1 was determined as follows. Carboxymethylated protein was digested with lysyl endopeptidase (Fig. 3), trypsin (Figs. 4 and 5), S. aureus V8 protease (Fig. 6), pepsin (Fig. 7), and endoproteinase Asp-N (Fig. 8). Amino acid compositions and the results of amino acid sequencing of intact protein and its peptides are given in Tables I and II, respectively. The strategy used to establish the complete sequence is shown in Fig. 9.
The nature of the amino acid residue at position 7 of Tylorrhynchus linker 1 is uncertain. The amino acid sequence at positions 7 and 8 was determined to be Asp-Gly using peptide L1 (see Fig. 9), although yields of Edman cycles became much lower after position 7. We assumed that these low yields result from a cyclized Asn-Gly sequence, and that the original residue at position 7 is Asn and not Asp. Sequencing nevertheless proceeded beyond residue 7, presumably because a small amount of Asp was generated during the isolation of the peptides.
Tylorrhynchus linker chain 1 consisted of 253 amino acid residues, contained 12 half-cystines, and had a calculated molecular mass of 28,200 Da. The isoelectric point of linker 1 was calculated to be 6.2 from the exact amino acid composition.
The amino acid sequence of Tylorrhynchus linker 2 was determined by digestions with lysyl endopeptidase (Figs. 10-12), chymotrypsin (Fig. 13), and endoproteinase Asp-N (Fig. 14). Amino acid compositions and the results of amino acid sequencing of intact protein and its peptides are given in Tables III and IV,
Lum, Lumbricus. X, unidentified amino acid residue. Asterisks indicate the residues conserved.
half-cystines, and had a calculated molecular mass of 26,316 Da. The isoelectric point was calculated to be 5.1.
DISCUSSION
We have succeeded in isolating two kinds of linker subunits (linkers 1 and 2) of Tylorrhynchus hemoglobin by reverse-phase chromatography.
These subunits constituted 6 and 13% of total protein area on the chromatogram, respectively. In all cases investigated so far, the linkers are more hydrophilic than the heme-containing subunits on reverse-phase chromatography.
Linker chains are presumably non-heme proteins, since the minimum molecular mass per 1 mol of heme of Tylorrhynchus hemoglobin was estimated to be 20,680 g, which is higher than the values (16,000-17,000 g) for usual hemoglobins.
Gotoh (12) and Suzuki et al. (10) could not find any major protein bands on SDS-PAGE corresponding to linkers in Tylorrhynchus hemoglobin. This indicates that the linkers were completely lost during hemoglobin preparation.
One reason might be the addition of 1 mM EDTA to the hemoglobin solution, which is known to protect hemoglobin from oxidation but also to chelate the calcium and magnesium ions required for stability of the giant molecular architecture (5).
The SDS-PAGE patterns of Tylorrhynchus linker subunits in the absence or presence of a reducing agent showed that they form disulfide-bonded homodimers of ~28-kDa polypeptide chains. This is consistent with the results for the linkers of Arenicola (23), Nephtys (13), Perinereis (14), and Neanthes³ extracellular hemoglobins. However, it has also been reported that in many cases, such as Lumbricus and Lamellibrachia hemoglobins (1,4), the linkers do not form disulfide-bonded dimers. Even in the latter cases it is still possible that the monomeric linkers form dimers in intact giant hemoglobins.
Amino acid sequence data are available for several linkers, including the N-terminal 28 residues of Lumbricus chain D1A (15), the complete sequence and N-terminal 17 residues of Lamellibrachia chains AV and AVI (7,16), and the complete sequences of Tylorrhynchus linkers 1 and 2 (this work). Fig. 16 compares these N-terminal sequences. In our alignment, only three residues (Gln-44, Arg-47, and Leu-51) appear to be invariant, and the N-terminal sequences vary remarkably. We believe that this variation is not due to proteolytic cleavage, since a protease inhibitor was used in the isolation procedure of the hemoglobins.
In any event, it is impossible at this stage to draw any conclusion about the evolutionary relationship between the sequences. The complete amino acid sequences of Tylorrhynchus linkers 1 and 2 are aligned with that of Lamellibrachia linker (chain AV) (7)  sequence homology between Tylorrhynchus linker 2 and Lamellibrachia linker is 27%. In this alignment, 33 residues appear to be invariant, among which the 8 half-cystine residues conserved at positions 91, 98, 105, 112, 118, 129, 150, and 246 are especially notable. Such residues would contribute to the formation of some intrachain disulfide bonds essential for the protein folding of linkers. In Tylorrhynchus linker chains, all of the half-cystine residues are conserved exactly. Four half-cystine residues found only in Tylorrhynchus linkers 1 and 2 (positions 10, 12, 223, and 233) are the probable candidates for the site(s) of interchain disulfide bonds. Among them, Cys-10 and Cys-12 are the most probable, because these residues are found in Tylorrhynchus linkers that form disulfide-bonded dimers, but not in Lamellibrachia and Lumbricus linkers (see Fig. 16).
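The two quantities used in this comparison, percent identity between aligned chains and the alignment positions at which half-cystine is conserved in every chain, can be computed as in the following minimal sketch. The function names and the toy sequences are hypothetical and are not the linker sequences. The sketch also assumes that identity is counted only over columns where neither chain has a gap; the original alignment may have used a different denominator.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of identical residues over the ungapped columns of two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns where either chain has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


def conserved_positions(aligned, residue="C"):
    """1-based alignment positions where every sequence carries `residue`."""
    return [
        i + 1
        for i, column in enumerate(zip(*aligned))
        if all(r == residue for r in column)
    ]


# Hypothetical toy alignment (one-letter codes, "-" marks a gap).
toy_a = "ACDE-CGH"
toy_b = "ACEEKC-H"
toy_c = "TCDQ-CGH"

identity_ab = percent_identity(toy_a, toy_b)
cys_columns = conserved_positions([toy_a, toy_b, toy_c])
```

Applied to a three-chain alignment such as the one in Fig. 17, `conserved_positions` would return the alignment columns at which half-cystine is shared by all chains.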
We proposed previously that the Lamellibrachia linker chain resulted from gene duplication of a heme-containing 16-17-kDa chain with a three exon-two intron structure, and that the first exon of domain 1 and the last exon of domain 2 had been lost during evolution (7). According to this idea, the second domain begins with position 147 (Fig. 17), and the (21) hemoglobins were determined, and it was shown that a heme-containing chain shows at least 30% homology with the other chain. Moreover, Suzuki (21) found that the amino acid substitution rate for the functionally essential central exon is about 1.5 times slower than that for the structurally essential side exons; a central exon has at least 37% homology with the other central exon, but a side exon has at least 25% homology. Here it is especially noted that the latter homology (25%) is very similar to the homology (23-27%) between Tylorrhynchus and Lamellibrachia linkers. This implies that the linkers evolved at the same rate as the structurally essential units of heme-containing chains, and that this would result from the absence, in the linkers, of function as an oxygen-binding protein.
Suzuki and Gotoh (22) proposed a symmetrical "192-chain" model for the subunit assembly of Tylorrhynchus hemoglobin, in which the minimum structural entity is a tetramer consisting of the trimer (subunit T in Fig. 1) and monomer (subunit M in Fig. 1), and the whole molecule is composed of 48 tetramers. However, this model should be revised, since the presence of 26-28-kDa linker subunits (L1 and L2) has become evident in this study.
A new molecular model for the 1/12th submultiple of Tylorrhynchus hemoglobin is proposed in Table V, based on the exact molecular masses of subunits T, M, L1, and L2, and their molar ratios estimated from peak area on the reverse-phase chromatogram (Fig. 2). In the model, a submultiple is composed of four tetramers (subunits T + M) and a disulfide-bonded linker subunit, that is, 16 heme-containing chains and two 26-28-kDa linker chains. This estimation gave a value of 326.8 kDa for the molecular mass of the submultiple, and therefore the minimum molecular mass per 1 mol of heme was calculated to be 20,425 g, in very good agreement with the value of 20,680 g obtained experimentally.
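The arithmetic behind this model can be checked with a short sketch. All numerical inputs are values quoted in the text. The function names are illustrative, and the calculation assumes one heme per heme-containing chain and 12 submultiples per intact molecule.

```python
# Values quoted in the text.
SUBMULTIPLE_MASS_DA = 326_800       # modelled mass of one submultiple
HEME_CHAINS_PER_SUBMULTIPLE = 16    # four tetramers (subunits T + M)
LINKER_CHAINS_PER_SUBMULTIPLE = 2   # one disulfide-bonded linker dimer
SUBMULTIPLES_PER_MOLECULE = 12      # assumption: twelve submultiples per molecule
MEASURED_MASS_PER_HEME = 20_680     # g of protein per g eq of heme (n = 5)
MEASURED_SD = 1_460


def mass_per_heme(submultiple_mass_da, heme_chains):
    """Minimum protein mass per mole of heme, assuming one heme per heme chain."""
    return submultiple_mass_da / heme_chains


def whole_molecule_counts(heme_chains, linker_chains, submultiples):
    """Chain counts for the intact molecule built from identical submultiples."""
    heme_total = heme_chains * submultiples
    linker_total = linker_chains * submultiples
    return {
        "heme": heme_total,
        "linker": linker_total,
        "total": heme_total + linker_total,
    }


model_mass_per_heme = mass_per_heme(SUBMULTIPLE_MASS_DA, HEME_CHAINS_PER_SUBMULTIPLE)
counts = whole_molecule_counts(
    HEME_CHAINS_PER_SUBMULTIPLE,
    LINKER_CHAINS_PER_SUBMULTIPLE,
    SUBMULTIPLES_PER_MOLECULE,
)
# Does the model value fall within one standard deviation of the measurement?
within_one_sd = abs(MEASURED_MASS_PER_HEME - model_mass_per_heme) <= MEASURED_SD
```

Under these assumptions the sketch gives 20,425 g per mole of heme and 192 heme-containing plus 24 linker chains (216 in total) for the whole molecule. The model value differs from the measured 20,680 g by 255 g, which is well inside the reported standard deviation.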
Consequently  (15), except for a difference in the number of linker chains. They estimated that 32-37-kDa non-heme (linker) chains, whose exact masses are not known since amino acid sequences have not been determined, are present in the proportion of one per 1/12th submultiple (15), but we estimated that two 26-28-kDa linker chains are present in the submultiple as a disulfide-bonded dimer. Neither model distinguishes the two types of linkers, such as Tylorrhynchus linkers 1 and 2 and Lumbricus D1 and D2, since the number of each linker chain does not give a multiple of 12, the number of submultiples in the intact molecule. However, if the protein foldings of the two types of linkers are similar, they may be used as linkers without distinction. In fact, all of the half-cystine residues are conserved in Tylorrhynchus linkers 1 and 2, suggesting a similar protein folding (Fig. 17).
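The statement that neither linker is present in a multiple of 12 copies can be illustrated with a rough calculation. The sketch below takes the chromatogram areas (6 and 13%) and calculated masses (28,200 and 26,316 Da) from the text and assumes that peak area is proportional to protein mass. That proportionality and the function name are assumptions made here for illustration, not the procedure used to build Table V.

```python
# Values quoted in the text.
LINKER_MASS_DA = {"L1": 28_200, "L2": 26_316}
PEAK_AREA_PERCENT = {"L1": 6, "L2": 13}
TOTAL_LINKER_CHAINS = 24  # two linker chains in each of 12 submultiples


def linker_chain_split(areas, masses, total_chains):
    """Split the linker chains between types using area/mass as a molar proxy."""
    moles = {name: areas[name] / masses[name] for name in areas}
    total_moles = sum(moles.values())
    return {name: total_chains * moles[name] / total_moles for name in moles}


split = linker_chain_split(PEAK_AREA_PERCENT, LINKER_MASS_DA, TOTAL_LINKER_CHAINS)
```

Under this assumption the 24 linker chains divide into roughly 7 copies of linker 1 and 17 copies of linker 2. Neither number is close to 12 or 24, so the two types cannot be assigned fixed, equivalent positions in each of the 12 submultiples.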