Primary structure of a linker subunit of the tube worm 3000-kDa hemoglobin.

The deep-sea tube worm Lamellibrachia contains two giant extracellular hemoglobins, a 3000-kDa hemoglobin and a 440-kDa hemoglobin. The former consists of four heme-containing chains (AI-AIV) and two linker chains (AV and AVI) that assemble the heme-containing chains. The 440-kDa hemoglobin consists of only four heme-containing chains (Suzuki, T., Takagi, T., and Ohta, S. (1988) Biochem. J. 255, 541-545). The complete amino acid sequence of a linker subunit (chain AV) has been determined by automated Edman sequencing of the peptides derived by digestion with lysyl endopeptidase and endoproteinase Asp-N. The chain is composed of 224 amino acid residues, and the molecular mass of the protein moiety was calculated to be 24,894 Da. An Asn-X-Thr sequence, a possible glycosylation site, was suggested at positions 108-110. A computer-assisted homology search revealed no notable homology with any other globin or protein. However, a careful alignment of the linker sequence with a heme-containing chain sequence suggested a slight but significant homology between the two sequences. The alignment also suggested that the linker resulted from gene duplication of a heme-containing chain with a three exon-two intron structure, and that the first exon of domain 1 and the last exon of domain 2 had been lost during evolution. In our alignment, domain 1 has the heme-binding proximal histidine, but domain 2 does not. This is the first linker subunit to be sequenced completely.

One of the most remarkable differences between annelid 3000-4000-kDa hemoglobins and other invertebrate and vertebrate hemoglobins is their low heme (iron) content. This results from the presence of non-heme 31-37-kDa chains that act as 'linkers' for the assembly of heme-containing chains into a giant molecular architecture (Vinogradov, 1985; Vinogradov et al., 1986 32-36-kDa subunits which are proposed to be the linkers (Suzuki et al., 1988, 1989). In this paper, we report the first complete amino acid sequence of a linker subunit from the Lamellibrachia 3000-kDa hemoglobin.
(Suzuki et al., 1988). The chain AV-1 was carboxymethylated as described previously (Suzuki and Gotoh, 1986). The chain (7 nmol 150 mM glucose at a flow rate of 0.5 ml/min.
The eluate was monitored for absorbance at 280 nm, and the elution profile was obtained by subtracting its base line.

RESULTS
Lamellibrachia 3000-kDa hemoglobin has two types of linker subunits, chains AV and AVI, at a ratio of about 5:1. The major linker chain AV can be separated further into two isoforms, AV-1 and AV-2, by reverse-phase chromatography, but their amino acid sequences differ at only one position in the N-terminal 40 residues examined (Suzuki et al., 1988). In the present study, we confirmed that peptide mapping (lysyl endopeptidase digestion) of both chains gave almost identical results and that the amino acid compositions of corresponding peptides were indistinguishable (data not shown). Thus the amino acid sequences of chains AV-1 and AV-2 appeared to be almost identical. We selected chain AV-1 for sequence determination.
Portions of this paper (including Tables 1 and 2
Automated sequencer (0) was employed for sequence determination. Key: L, a lysyl endopeptidase peptide; A, an endoproteinase Asp-N peptide.
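The peptide mapping step can be illustrated in silico. The following is a minimal sketch, not the procedure used in this work: it assumes idealized specificities (lysyl endopeptidase cleaving on the C-terminal side of every lysine, endoproteinase Asp-N cleaving on the N-terminal side of every aspartate, with no missed cleavages), and the function name `digest` and the toy sequence are hypothetical.

```python
def digest(sequence, enzyme):
    """Split a one-letter protein sequence at idealized cleavage sites.

    Assumed rules (a simplification of real enzyme behavior):
      "lys-c": cut after every K (lysyl endopeptidase)
      "asp-n": cut before every D (endoproteinase Asp-N)
    """
    if enzyme == "lys-c":
        cuts = [i + 1 for i, aa in enumerate(sequence) if aa == "K"]
    elif enzyme == "asp-n":
        cuts = [i for i, aa in enumerate(sequence) if aa == "D"]
    else:
        raise ValueError("unknown enzyme: " + enzyme)
    # Drop cuts that fall at either end, since they would give empty peptides.
    bounds = [0] + [c for c in cuts if 0 < c < len(sequence)] + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


toy = "AGKDLTKMDE"  # hypothetical sequence, not chain AV-1
print(digest(toy, "lys-c"))  # ['AGK', 'DLTK', 'MDE']
print(digest(toy, "asp-n"))  # ['AGK', 'DLTKM', 'DE']
```

Because the two enzymes cut at different residues, the two peptide sets overlap each other, which is what allows the peptides to be ordered along the chain.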
The amino acid sequence of a linker subunit (chain AV-1) of the 3000-kDa hemoglobin from the deep-sea tube worm Lamellibrachia was determined by automated Edman degradation of the intact protein and of peptides derived by cleavage with lysyl endopeptidase and endoproteinase Asp-N. The strategies used to establish the complete sequence are summarized in Fig. 1. The alignment of the peptides is supported by overlaps of at least 3 residues. Peptide L5 eluted at two different positions on a reverse-phase column (Fig. 4 in the Miniprint). However, no difference was found in the amino acid compositions or the N-terminal sequences (34 residues) of the two L5 peptides. To confirm the C-terminal sequence, chain AV was cleaved with CNBr. The homoserine-free peptide corresponding to positions 212-224 was obtained, confirming the proposed C-terminal sequence (data not shown). The residue at position 108 was not detected by Edman degradation of peptides L5 and A4, but the amino acid composition of peptide A4 clearly suggested that the residue is Asx. Moreover, since peptide L5 bound to an immobilized concanavalin A column (data not shown), we assumed that the residue is an asparagine carrying a carbohydrate group, in view of the threonine (position 110) located 2 residues downstream. An Asn-X-Thr sequence is well known as a glycosylation site. As shown in Fig. 1, Lamellibrachia linker chain AV-1 is composed of 224 amino acid residues and has a calculated molecular mass of 24,894 Da for the protein moiety. There is a difference between the molecular mass obtained from sequencing (25 kDa) and that from sodium dodecyl sulfate-polyacrylamide gel electrophoresis (32 kDa) reported previously (Suzuki et al., 1988). The carbohydrate groups might contribute to this difference to some extent, although we could not determine them because only a very small amount of sample was available.
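The search for a potential N-glycosylation site such as the Asn-X-Thr at positions 108-110 can be sketched as a simple pattern scan. This is an illustration only: it assumes the commonly used consensus Asn-X-Ser/Thr with X not proline, which is broader than the Asn-X-Thr mentioned in the text, and the function name `find_sequons` and the toy sequence are hypothetical. A match indicates a candidate site, not proof of glycosylation.

```python
import re

# Lookahead so that overlapping candidate sites are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return (1-based position of Asn, tripeptide) for each candidate site."""
    return [
        (m.start() + 1, sequence[m.start():m.start() + 3])
        for m in SEQUON.finditer(sequence)
    ]


toy = "MKNASGNPTLNCTK"  # hypothetical sequence, not chain AV-1
print(find_sequons(toy))  # [(3, 'NAS'), (11, 'NCT')]
```

In the toy sequence the Asn at position 7 is skipped because it is followed by proline.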

DISCUSSION
The invertebrate 3000-4000-kDa hemoglobins found in annelids and tube worms have a characteristic linker subunit for the assembly of heme-containing chains (Vinogradov et al., 1986; Suzuki et al., 1988). The molecular mass of the linker chains is about twice that of a typical heme-containing chain, 30-37 kDa on sodium dodecyl sulfate-polyacrylamide gel electrophoresis (Vinogradov, 1985). In addition, heme and protein analyses suggested that at least a major part of the linker chains exists as a non-heme protein. Although the stoichiometry of the linker chain to heme-containing chains is not well established, it is likely that one or two linker subunits are present per one-twelfth subunit of the intact molecule (Vinogradov et al., 1986; Fushitani and Riggs, 1988). One question of biological interest is the evolutionary origin of the linker chain, that is, whether the linker is derived from a protein quite different from hemoglobin or results from gene duplication of a heme-containing chain. This work is aimed at answering this question.
We have determined the amino acid sequence of 224 residues of Lamellibrachia chain AV-1. This is the first linker subunit of annelid-like giant multisubunit hemoglobin to be sequenced completely. The N-terminal amino acid sequence of Lamellibrachia linker chain shows a low, but significant homology (23% identity) with the partial sequence (28 residues) of Lumbricus linker chain (Fushitani and Riggs, 1988). A computer-assisted homology search (the GENAS, Kyushu University) for the complete amino acid sequence of Lamellibrachia linker chain AV was carried out, but no protein with notable homology was found. However, considering that the sequence homology between the hemoglobin chains is relatively low, we tried to align carefully the sequence of the linker chain with those of related heme-containing chains, Lamellibrachia chain AIII (Suzuki et al., 1990) and Lumbricus chain c (Jhiang et al., 1988). First, noting the two structurally important cysteine residues (positions 8 and 146 in Fig. 2) characteristic of all heme-containing chains of annelid and Lamellibrachia hemoglobins, we aligned them roughly by eye. Then the alignment was improved by a computer program based on the algorithm of Feng et al. (1985). As shown in Fig. 2, there is a slight, but significant similarity between the three chains. This alignment also suggests that the linker chain has a two-domain structure ( the linker show 14 and 12% homologies, respectively, with that of Lamellibrachia chain AIII. Here, if we consider that the globin gene, including that of a heme-containing chain (chain c) of the giant hemoglobin from the earthworm Lumbricus terrestris (Jhiang et al., 1988), has a two intron-three exon structure, the large deletions found in the N- and C-terminal regions correspond exactly to deletions of the first exon of domain 1 and of the last exon of domain 2, respectively.
If so, this also supports the idea that the linker chain might result from gene duplication of a heme-containing chain. Consistent with this, Lightbody et al. (1988) found from a monoclonal antibody study that the linker chain of Lumbricus giant hemoglobin is related to its heme-containing chain I.
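Identity percentages such as those quoted for the alignment are simple counts over aligned columns. The sketch below shows one way to compute such a figure. It assumes that columns containing a gap in either sequence are excluded from the denominator, whereas the source does not state which convention was used. The function name `percent_identity` and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical aligned fragments, not taken from Fig. 2.
print(percent_identity("AAAA", "AAAT"))  # 75.0
```

Other conventions, for example dividing by the length of the shorter sequence, give somewhat different values, so identity figures are only comparable when the denominator is the same.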
The hydropathy profile (Kyte and Doolittle, 1982) of Lamellibrachia linker chain AV-1 was compared with that of heme-containing chain AIII (Suzuki et al., 1990). As shown in Fig. 3, the profile of domain 1 of the linker slightly resembled the corresponding profile of the heme-containing chain, but that of domain 2 was rather different, suggesting that domain 1 might adopt a globin fold, even if incomplete. Since the heme-binding proximal histidine is conserved in domain 1 in our alignment (Fig. 2), it is possible that domain 1 carries a heme group. The functionally important distal histidine is replaced by glutamine in the domains of the linker chain (Fig. 2), as in the case of several invertebrate globins. In contrast, domain 2, lacking the proximal histidine, could never carry a heme. The steric location of linker chains in the gross quaternary structure is unknown. Vinogradov et al. (1986) assumed that the linker chains form a closed circular collar ("bracelet" model) with 12 complexes of 16 heme-containing chains each. Alternatively, they may assemble the heme-containing chains without forming a continuous bracelet. In any case, a definite answer must await the three-dimensional structure of the intact giant hemoglobin.
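A hydropathy profile of the kind compared in Fig. 3 is a moving average of per-residue hydropathy values along the chain. The following is a minimal sketch using the Kyte and Doolittle (1982) scale. The window length of 9 is an assumption, since the source does not state the window used, and the function name `hydropathy_profile` is hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Mean hydropathy of each full window sliding along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in sequence]  # KeyError on non-standard residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical fragment, not chain AV-1; one value per window position.
profile = hydropathy_profile("MKVLAAGIVDEKR", window=9)
print(len(profile))  # 5
```

Two profiles computed with the same window can then be plotted against alignment position and compared by eye. Positive values indicate hydrophobic stretches.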
The tube worm giant hemoglobins are believed to transport H2S (Arp et al., 1987). We proposed previously that a free cysteine residue is responsible for the sulfide-binding ability (Suzuki et al., 1990). Since Lamellibrachia chain AV-1 is rich in half-cystines (9 residues, at positions 64, 71, 77, 84, 90, 101, 120, 173, and 216 (Fig. 1)) and our preliminary fluorescent-labeling experiment suggested that chain AV has a reactive thiol in the intact Lamellibrachia 3000-kDa molecule (Suzuki et al., 1990), it is likely that chain AV also acts as an H2S carrier.