Isolation and molecular cloning of transferrin from the tobacco hornworm, Manduca sexta. Sequence similarity to the vertebrate transferrins.

An iron-binding glycoprotein of Mr = 77,000 has been isolated from hemolymph of the adult sphinx moth Manduca sexta. Since this protein binds ferric ion both in vivo and in vitro and has a secondary structure similar to that of human serum transferrin and human lactoferrin as judged by CD spectra, we decided to clone its cDNA in order to determine its relationship to the vertebrate transferrins. Antiserum generated against this protein was used to screen a larval fat body cDNA library. A 2.0-kilobase clone was isolated that selects an mRNA which, when translated in vitro, produces an immunoprecipitable 77-kDa protein. When the library was rescreened using the 2.0-kilobase clone as a probe, three full-length clones were isolated, and the complete nucleotide sequence of one 2,183-base pair insert was determined. The deduced protein sequence contains an 18-amino acid signal sequence and a mature protein sequence of 663 amino acids with a calculated Mr of 73,436. The sequence was used to search the National Biomedical Research Foundation (NBRF) protein database, revealing significant similarity to the vertebrate transferrins, a family of 80-kDa glycoproteins which transport and sequester iron in the blood and other body fluids. A multiple sequence alignment shows the greatest areas of similarity to be around the two iron binding sites, although the insect protein seems to contain only one such functional site. Moreover, 23 of the 24 cysteine residues in the insect protein occupy positions identical to those in the other transferrins, indicating a similar overall tertiary structure. Comparison of the two halves of the insect sequence indicates that the protein may have arisen as a result of gene duplication. The similarity of the M. sexta sequence to the vertebrate transferrins may provide important clues to transferrin evolution.

Transferrins are transport glycoproteins which permit the effective passage of the relatively toxic and readily hydrated ferric ion through the vertebrate vascular system (Huebers and Finch, 1987). The iron bound by the transferrins is thus maintained in a bioavailable form for use in the synthesis of iron-containing proteins such as hemoglobin and the cytochromes. A number of transferrins have been characterized and sequenced, including human serum transferrin (HUTF) (MacGillivray et al., 1983; Yang et al., 1984), chicken ovotransferrin (CHTF) (Jeltsch and Chambon, 1982), human lactoferrin (HLTF) (Metz-Boutigue et al., 1984; Rado as quoted in Anderson et al., 1989), and, most recently, a melanotransferrin (HMTF) found in human melanoma cells (Rose et al., 1986). The transferrins contain two domains as determined by x-ray crystallography of HLTF and rabbit serum transferrin (Anderson et al., 1989; Bailey et al., 1988) and exhibit extensive internal sequence homology between these domains. Each domain binds a single iron atom as well as a bicarbonate anion (Schlabach and Bates, 1975). The distribution of disulfide bonds in the transferrins is very well conserved; six common disulfides are found in essentially similar positions in the NH2-terminal domain as are the nine common disulfides in the C-terminal domain (Metz-Boutigue et al., 1984).
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank/EMBL Data Bank with accession number(s) M36296.
Although the structure and function of vertebrate transferrins have been well studied, iron binding proteins from invertebrates have yet to be closely scrutinized. Iron binding proteins from a crab (150 kDa) and a tarantula (80-100 kDa) have been described previously (Huebers et al., 1982; Lee et al., 1978), as was a transferrin-like protein isolated from a tunicate, Pyura stolonifera (Martin et al., 1984). Although all vertebrate transferrins are 80 kDa and bind two ferric ions, the Pyura protein is 40 kDa and binds a single ferric ion. The two-domain vertebrate transferrins are thought to have arisen by duplication of an ancestral gene encoding a single domain (Greene and Feeney, 1968); this duplication has been confirmed for human serum transferrin by Park et al. (1985). Although the existence of the Pyura and invertebrate proteins may be evolutionarily significant, it has not been prudent to speculate on their relationship to each other or to vertebrate transferrins because of the lack of sequence information available.
Although transport of fats and carbohydrates in insects has been well studied (Kanost et al., 1990), the transport of micronutrients such as iron has yet to be carefully examined. Insects undoubtedly require a large supply of iron to be used in the cytochrome heme structure of their highly aerobic muscle system. Since insect tissues are known to contain ferritin (Huebers et al., 1988; Nichol and Locke, 1989), what would be needed to complete the iron storage system in insects is a hemolymph transport protein that could receive iron at the gut and transport it to the tissues for storage in the form of ferritin. ... have been isolated and characterized (Kanost et al., 1990). An 80-kDa glycoprotein was tentatively identified as a transferrin because it had a brownish color and bound iron (Kanost et al., 1990; Bartfeld and Law, 1990). While we were initiating the isolation and characterization of this protein, Huebers et al. (1988) reported some of the properties of the same protein from larvae, including its involvement in iron transport. In this paper we present evidence that this protein is an insect transferrin based upon sequence comparisons and structural and functional data. This is the first reported sequence of such a protein, and it bears a clear structural relationship to the vertebrate transferrins.

MATERIALS AND METHODS
Animals-Adult M. sexta were raised from eggs supplied by Drs. J. P. Reinecke and J. S. Buckner (U. S. Department of Agriculture, Fargo, ND). After hatching, the larvae were raised as described previously (Prasad et al., 1986a).
Isolation of Transferrin-One hundred adult animals were bled by the flushing-out method (Chino et al., 1987) ...
Iron Binding Studies-(a) In vivo experiment: 10 µCi of 65FeSO4 (10 µCi/µg) was injected into three adult animals, and they were bled 10 min later. Ten µl of hemolymph (125,000 cpm) were subjected to nondenaturing gel electrophoresis on 4-20% gradient slab gels at 4 °C for 1,700 V-h and stained with Coomassie Blue. The gel was dried and subjected to autoradiography using Kodak X-Omat AR film and an intensifying screen at -80 °C. (b) In vitro experiment: 0.2 mg of transferrin was dissolved in 1.5 ml of 0.01 M EDTA, 0.1 M NaCl, pH 5.5 (Cochran et al., 1984) in order to remove bound iron and incubated for 30 min at room temperature in a Centricon cell (Amicon Corp.) followed by centrifugation through a YM-10 membrane. After washing the membrane three times with 0.05 M sodium citrate, pH 5.5, radiolabeled FeSO4 (2.5 µCi in 0.01 M HCl) was added and the pH slowly raised to 7.5 with 0.01 M NaHCO3 followed by a 30-min incubation at room temperature. The sample was centrifuged and the membrane washed three times with 0.02 M Tris-HCl, pH 8.0. A 50-µl sample (120,000 cpm) was analyzed by nondenaturing gel electrophoresis as described above.
Amino Acid Analysis-Amino acid data were obtained by analysis of duplicate 50-µg samples hydrolyzed in vacuo at 110 °C in 6 N HCl for 24, 48, and 72 h. Samples were analyzed using a Beckman 7300 amino acid analyzer.
Carbohydrate Analysis-The colorimetric phenol-sulfuric acid assay (Dubois et al., 1956) ... (Chirgwin et al., 1979). Poly(A)+ RNA was prepared from total RNA by oligo(dT)-cellulose chromatography (Aviv and Leder, 1972).
Hybrid-select Translation-Hybrid-select translation was performed according to Miller et al. (1983) as modified by Kanost et al. (1989)

RESULTS AND DISCUSSION
Purification of Transferrin-The first step in the purification, KBr density gradient ultracentrifugation, separated the main lipoprotein, lipophorin, from the remainder of the hemolymph proteins. Gel filtration chromatography on Bio-Gel A 1.5 (Fig. 1A) removed the yolk precursor protein, vitellogenin (Osir et al., 1986a), as well as residual lipophorin. Gel filtration on Sephadex G-100 (Fig. 1B) allowed separation of several proteins in the 21-45-kDa range as well as apolipophorin III, an 18-kDa dissociable component of lipophorin (Kawooya et al., 1984). At this stage in the purification, the main proteins remaining were the 84-kDa blue-colored biliprotein tetramer, insecticyanin (Riley et al., 1984), and a post-larval protein (Ryan et al., 1988). Insecticyanin was useful in monitoring transferrin-containing fractions from gel filtration columns since the two proteins coeluted. Cibacron blue dye affinity chromatography removed these two proteins by selectively binding transferrin (Fig. 1C). Chromatography on concanavalin A-Sepharose (Fig. 1D) removed all of the minor bands with the exception of a 50-kDa contaminant, which was removed by chromatography on hydroxylapatite (Fig. 1E). The final preparation is shown in Fig. 2.

Characterization of Transferrin-The sequence of the N-terminal 34 amino acids has been determined previously by Edman degradation of the intact protein and is underlined in Fig. 8. The chemically determined amino acid composition, shown in Table I, is similar to that of human and chicken transferrins, although it is somewhat lower in glycine and cysteine.
Transferrin contained 2% carbohydrate by weight, which is at the lower end of the 2-12% range reported for mammalian transferrins. Compositional analysis indicated the presence of mannose and N-acetylglucosamine in a ratio of 5:1, similar to the 9:2 ratio reported for other M. sexta glycoproteins such as arylphorin (Ryan et al., 1985) and vitellogenin (Osir et al., 1986b), but quite different from those found in the mammalian transferrins, which are not as high in mannose and often contain sialic acid (Spik et al., 1979). Calculations indicated the presence of a single oligosaccharide chain per transferrin molecule. The number of chains per molecule in mammalian transferrins ranges from one to four.
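The estimate of one oligosaccharide chain per molecule can be checked with simple arithmetic. The sketch below is not the authors' calculation: it assumes the SDS-PAGE value of 77,000 for the glycoprotein, standard average residue masses for mannose and N-acetylglucosamine, and two hypothetical high-mannose chain compositions (Man5GlcNAc2 and Man9GlcNAc2) chosen only to bracket a plausible chain size.

```python
# Average masses (Da) of sugar residues within a chain (monosaccharide minus water).
MAN_RESIDUE = 162.14      # hexose residue, C6H10O5
GLCNAC_RESIDUE = 203.19   # N-acetylhexosamine residue, C8H13NO5


def carbohydrate_mass(glycoprotein_mr, weight_fraction):
    """Mass of carbohydrate carried by one glycoprotein molecule."""
    return glycoprotein_mr * weight_fraction


def chain_mass(n_man, n_glcnac):
    """Mass contributed by one protein-linked chain of the given composition."""
    return n_man * MAN_RESIDUE + n_glcnac * GLCNAC_RESIDUE


def chains_per_molecule(glycoprotein_mr, weight_fraction, n_man, n_glcnac):
    """Estimated number of chains, given an assumed chain composition."""
    total = carbohydrate_mass(glycoprotein_mr, weight_fraction)
    return total / chain_mass(n_man, n_glcnac)


# Assumed inputs: Mr = 77,000 and 2% carbohydrate by weight (values from the text).
# The chain compositions are hypothetical illustrations, not measured structures.
for n_man, n_glcnac in [(5, 2), (9, 2)]:
    estimate = chains_per_molecule(77_000, 0.02, n_man, n_glcnac)
    print(f"Man{n_man}GlcNAc{n_glcnac}: {estimate:.2f} chains per molecule")
```

Under either assumed composition the estimate rounds to one chain, which is consistent with the single oligosaccharide chain reported here.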
The CD spectrum of transferrin (Fig. 3) indicated a structure low in α-helix (13%) and high in β-sheet (55%). The CD spectra of human serum transferrin and human lactoferrin also indicate low α-helical content (20 and 27%, respectively) and high β-sheet content (65 and 60%, respectively) (Mazurier et al., 1976), implying secondary structures similar to that of the insect protein.
The iron binding studies verified that this protein did indeed bind iron (Fig. 4). Huebers et al. (1988) determined that the larval transferrin also bound a single iron atom.
Site of Synthesis-The fat body is the site of synthesis of most hemolymph proteins (Kanost et al., 1990). In vitro larval fat body incubation with [35S]methionine (Prasad et al., 1986b) indicated that transferrin was synthesized in the fat body (data not shown). In addition, in vitro translation of poly(A)+ RNA from larval fat body produced a 77-kDa protein that was precipitated with transferrin antibody (Fig. 5, lane 2). The presence of a lower band, a doublet consisting of the two subunits of the larval storage protein arylphorin (Kramer et al., 1980), is due to nonspecific precipitation as it was also precipitated by preimmune serum (Fig. 5, lane 3). This is probably an artifact of the immunoprecipitation since the antiserum was specific as judged by immunoblots (Fig. 6). Additionally, no reaction was observed with preimmune serum. We have no information as to whether other insect tissues may also synthesize transferrin.
Screening of the cDNA Library-A larval fat body cDNA library was screened with the antibody since the larval protein appeared to be identical to the adult protein based on amino acid composition (Huebers et al., 1988), molecular weight, and immunoreactivity (Fig. 6, lanes 1 and 2). Screening with antiserum resulted in the isolation of six positive clones, five of which appeared to have identical 1.7-kilobase inserts as judged by restriction analysis. The sixth clone had a 2.0-kilobase insert. Since the message size was determined to be 2.3 kilobases as judged by Northern blot hybridization, the 2.0-kilobase clone was used as a probe to rescreen the library, resulting in the isolation of three full-length clones. The three clones (λTF1, λTF2, λTF3) were identical as judged by restriction analysis with the exception of closely spaced EcoRV and KpnI sites which were present in λTF1 but absent in λTF2 and λTF3 (Fig. 7).
Characterization of the Transferrin cDNA-In order to confirm the identity of the cDNA, it was used to hybrid-select its corresponding mRNA from larval fat body. The plasmid vector lacking an insert was used as a control. When this selected message was translated in vitro, a 77-kDa protein was synthesized which migrated on SDS-PAGE gels at the same position as the transferrin immunoprecipitated from the total translation mixture (Fig. 5, compare lanes 2 and 4). Although there were background bands due to nonspecific binding of other mRNA species to the filters, transferrin was the main protein synthesized. Furthermore, immunoprecipitation of the products in lane 4, in which much less arylphorin was present than in lane 1, resulted in a single band corresponding to transferrin.
The control plasmid did not select an mRNA (Fig. 5, lane 6) ... from the stop codon. The poly(A) tail is 63, 21, and 17 nucleotides downstream from these recognition sequences, respectively. The most likely signals for polyadenylation are the ones 21 and 17 nucleotides upstream of the poly(A) tail, since the signals are most often present 11-30 nucleotides upstream from the poly(A) tail (Fitzgerald and Shenk, 1981).
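The spacing argument above can be illustrated with a short script. This is a minimal sketch, not the authors' procedure: it assumes the recognition sequence is the canonical AATAAA hexamer, measures distance as the number of nucleotides between the last base of the hexamer and the first A of the tail, and uses a made-up 3' end (`toy`) constructed to have the reported spacings of 63, 21, and 17 nucleotides. It is not the M. sexta sequence.

```python
import re


def polya_signal_distances(cdna, signal="AATAAA", min_tail=10):
    """Distances (nt) between the end of each signal and the start of the 3' poly(A) tail."""
    tail = re.search(r"A{%d,}$" % min_tail, cdna)
    if tail is None:
        return []
    upstream = cdna[:tail.start()]
    # A lookahead finds overlapping hexamers, as in AATAAATAAA.
    return [len(upstream) - (m.start() + len(signal))
            for m in re.finditer("(?=%s)" % signal, upstream)]


def likely_signals(distances, window=(11, 30)):
    """Keep signals inside the 11-30 nt window cited from Fitzgerald and Shenk (1981)."""
    low, high = window
    return [d for d in distances if low <= d <= high]


# Hypothetical 3' end: one distant hexamer, two overlapping hexamers, then the tail.
toy = "GCGC" + "AATAAA" + "GC" * 18 + "AATAAATAAA" + "GT" * 8 + "C" + "A" * 20

distances = polya_signal_distances(toy)
print(distances)                  # [63, 21, 17]
print(likely_signals(distances))  # [21, 17]
```

Only the two hexamers closest to the tail fall inside the window, which is the basis for preferring them as the functional signals.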
The sequence of the PvuII-PvuII restriction fragment was determined to confirm any sequence differences which would explain the absence of EcoRV and KpnI restriction sites in λTF2 and λTF3. λTF2 contained five nucleotide substitutions (Fig. 8) which did not alter any of the encoded amino acids except for residue 316, which was changed from an aspartate to a glutamate. Two of these substitutions altered the EcoRV and KpnI restriction sites. λTF3 contained the same substitutions with the exception of nucleotide 921, which was not changed. Since the cDNA library was constructed from multiple fat bodies, these clones may represent allelic variants of transferrin present in our M. sexta population.
Similarity of the Insect Protein to Vertebrate Transferrins-Comparison of the deduced amino acid sequence with those in the NBRF protein database indicated significant similarity to HUTF, CHTF, and HLTF. The M. sexta sequence (MSTF) was aligned with these sequences as well as the HMTF sequence using the multiple alignment program of Feng and Doolittle (1987) (Fig. 9), in which the sequences are aligned progressively, beginning with the most similar pair. This method produces an alignment which reflects the evolutionary history of the sequences. There were 99 amino acids (15%) conserved among all the transferrins.
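As an illustration of how such figures are derived from an alignment, the sketch below counts fully conserved columns and pairwise percent identity. It is not the Feng and Doolittle program, and the three aligned strings are invented for the example rather than taken from Fig. 9. The choice of denominator is also an assumption: here gapped columns are excluded, whereas the 15% quoted above corresponds to 99 conserved positions relative to the 663 residues of mature MSTF (99/663 ≈ 0.149).

```python
def percent_identity(a, b, gap="-"):
    """Percent identical residues over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not compared:
        return 0.0
    return 100.0 * sum(x == y for x, y in compared) / len(compared)


def fully_conserved_columns(alignment, gap="-"):
    """Indices of columns in which every sequence carries the same residue."""
    return [i for i, column in enumerate(zip(*alignment))
            if column[0] != gap and len(set(column)) == 1]


# Hypothetical toy alignment (rows are aligned sequences of equal length).
toy_alignment = [
    "MKT-AYCDE",
    "MKTWAYCDQ",
    "MRT-AFCDE",
]

print(fully_conserved_columns(toy_alignment))                   # [0, 2, 4, 6, 7]
print(percent_identity(toy_alignment[0], toy_alignment[1]))     # 87.5
print(percent_identity(toy_alignment[0], toy_alignment[2]))     # 75.0
```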
MSTF was 26-28% identical to the other transferrins (Table II). The significance of these similarities was assessed by comparing the MSTF sequence with 50 randomly shuffled sequences having the same amino acid composition as either HUTF, CHTF, HLTF, or HMTF using the RDF computer program (K-tuple = 2) (Lipman and Pearson, 1985). The results of this analysis were z values of 33.0 for HUTF, 34.9 for CHTF, 17.1 for HLTF, and 35.5 for HMTF, where z > 10 is considered significant. The other transferrins exhibit identities ranging from 41 to 62% (Table II); thus MSTF is more distantly related to these proteins than they are to each other. The greatest similarity was seen around the iron binding sites. Another feature of the transferrins is the internal homology exhibited between the two domains. HMTF, HUTF, HLTF, and CHTF exhibit 46, 41, 37, and 33% similarity between domains, respectively. The FASTp alignment (Lipman and Pearson, 1985) of the two putative domains of MSTF is shown in Fig. 10. The sequence exhibited some internal homology (19%), which indicates that the insect protein may also have arisen by gene duplication.
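The logic of the shuffling test can be sketched as follows. This is not the RDF program: the similarity score is replaced by a much simpler stand-in (the largest number of identical residues on any ungapped diagonal), and the function names, the seed, and the test sequences are all assumptions introduced for illustration. The z value is the observed score minus the mean score against shuffled sequences, divided by the standard deviation of the shuffled scores.

```python
import random
import statistics


def best_ungapped_score(a, b):
    """Highest count of identical residues on any single diagonal (no gaps allowed)."""
    best = 0
    for offset in range(-(len(b) - 1), len(a)):
        matches = 0
        for j, residue in enumerate(b):
            i = j + offset
            if 0 <= i < len(a) and a[i] == residue:
                matches += 1
        best = max(best, matches)
    return best


def shuffle_z_value(query, target, n_shuffles=50, seed=0):
    """z = (observed - mean of shuffled scores) / standard deviation of shuffled scores."""
    rng = random.Random(seed)
    observed = best_ungapped_score(query, target)
    residues = list(target)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # preserves the amino acid composition of the target
        shuffled_scores.append(best_ungapped_score(query, "".join(residues)))
    sd = statistics.stdev(shuffled_scores)
    if sd == 0:
        raise ValueError("shuffled scores show no variation; z value is undefined")
    return (observed - statistics.mean(shuffled_scores)) / sd


# Hypothetical sequences: the target is the query with one substitution.
query = "ACDEFGHIKLMNPQRSTVWY" * 3
target = query[:10] + "W" + query[11:]
print(round(shuffle_z_value(query, target), 1))
```

Because shuffling keeps the composition fixed, a high z value indicates that the observed similarity depends on residue order and not merely on the two proteins having similar amino acid compositions.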
The residues involved in iron binding may be tentatively assigned based on the crystal structures of rabbit serum transferrin and HLTF (Bailey et al., 1988; Anderson et al., 1989) and by comparing the conserved residues at the iron binding sites of transferrins whose complete sequences are known. These residues are thought to be two tyrosines, a histidine, and an aspartic acid (Fig. 9). In the NH2-terminal domain, Asp-58 and Tyr-188 are conserved among all five proteins. Another conserved tyrosine is replaced by a phenylalanine (residue 95) in the insect protein, although there is a conserved tyrosine at position 96 which may be the actual liganding residue. A conserved histidine is replaced by a glutamine at position 249, although there is another histidine at position 254 which could possibly take its place even though it is not conserved in the mammalian proteins. In addition, Arg-124, which most likely plays a role in binding the bicarbonate anion, is also conserved. In the C-terminal domain, only Asp-392 is conserved (except in HMTF). The remaining putative iron and bicarbonate binding residues have been replaced by other residues (Fig. 9). The definitive assignment of residues involved in iron binding will have to await the determination of a higher resolution x-ray crystal structure.
Another feature of the transferrins is the conservation of disulfide bonds in each domain (Metz-Boutigue et al., 1984). There are six disulfides common to HUTF, CHTF, and HLTF in similar positions in the NH2-terminal domain and nine common disulfides in the C-terminal domain. MSTF contains 24 cysteine residues, a somewhat smaller number than the mammalian transferrins (Fig. 9). However, 23 of these 24 residues are conserved among all five transferrins. This implies a similar overall tertiary structure between the insect protein and the other transferrins. Although the locations of the disulfide bonds in MSTF are unknown, it migrates significantly faster on SDS-PAGE gels in the absence of dithiothreitol than in its presence (data not shown), indicating that it is extensively disulfide-bonded. Cysteine residues corresponding to the six common disulfides in the NH2-terminal domain of the transferrins are found in identical positions in MSTF (Fig. 9). Of the nine C-terminal disulfides common to mammalian transferrins, cysteines corresponding to five of these are present in MSTF.
Analysis of the M. sexta sequence may explain the observation that the protein contains a single iron binding site. This site is most likely in the NH2-terminal half of the molecule for several reasons. 1) There are more conserved putative iron binding ligands in the NH2-terminal half of the molecule. 2) There is a greater sequence similarity to the other transferrins in the NH2-terminal half than in the C-terminal half. 3) The likely conserved disulfide bond distribution in the NH2-terminal half indicates a folded structure more similar to the mammalian transferrins than that of the C-terminal half, which potentially lacks four of the conserved disulfides present in the other transferrins; thus the NH2-terminal half may be in a better conformation for binding iron. The relationship of MSTF to the mammalian transferrins, along with the presence of transferrin-like proteins in other arthropods, may provide important clues to the evolution of the transferrins. In addition, isolation of the cDNA will facilitate cloning of the transferrin gene and subsequent studies of its regulation.