
Cloning and sequencing of a cDNA encoding the major storage proteins of Theobroma cacao

Identification of the proteins as members of the vicilin class of storage proteins

Planta

Abstract

The major storage proteins, polypeptides of 31 and 47 kilodaltons (kDa), from the seeds of cocoa (Theobroma cacao L.), have been identified and partially purified by preparative gel electrophoresis. The polypeptides were both N-terminally blocked, but some N-terminal amino-acid sequence was obtained from a cyanogen bromide peptide common to both polypeptides, permitting the construction of an oligonucleotide probe. This probe was used to isolate the corresponding copy-DNA (cDNA) clone from a library made from poly(A)+ RNA from immature cocoa beans. The cDNA sequence has a single major open reading frame that translates to give a 566-amino-acid polypeptide of Mr 65 612. The existence of a common precursor to the 31- and 47-kDa polypeptides of this size was confirmed by immunoprecipitation from total poly(A)+ RNA translation products. The precursor has an N-terminal hydrophobic sequence which appears to be a typical signal sequence, with a predicted site of cleavage 20 amino acids after the start. This is followed by a very hydrophilic domain of ∼110 amino acids, which, by analogy with the cottonseed α-globulin, is presumed to be cleaved off to leave a domain of approx. 47 kDa, very close to the observed size of the mature polypeptide. Like the hydrophilic domain of the cottonseed α-globulin, the cocoa hydrophilic domain is very rich in glutamine and charged residues (especially glutamate), and contains several Cys-X-X-X-Cys motifs. The cyanogen-bromide peptide common to the 47-kDa and 31-kDa polypeptides is very close to the proposed start of the mature domain, indicating that the 31-kDa polypeptide arises via further C-terminal processing. The polypeptide sequence is homologous to sequences of the vicilin class of storage proteins, previously found only in legumes and cotton. Most of these proteins have a mature polypeptide size of approx. 47 kDa, and are synthesised as precursors only slightly larger than this. Some, however, are larger polypeptides (e.g. α-conglycinin from soybean is 72 kDa), usually due to an additional N-terminal domain. In cottonseed the situation appears to parallel that in cocoa in that the vicilin is synthesised as an approx. 70-kDa precursor and then processed to a 47-kDa (and in the case of cocoa also a 31-kDa) mature protein. In this context it is interesting that cotton is closer in evolutionary terms to cocoa than are the legumes, both cotton and cocoa being in the order Malvales.
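The Cys-X-X-X-Cys motif mentioned in the abstract (a cysteine, any three residues, then a second cysteine) can be located in a one-letter protein sequence with a short pattern search. The following is only an illustrative sketch: the function name `find_cxxxc` and the toy sequence are hypothetical, and the toy sequence is not the cocoa precursor, whose sequence is not reproduced on this page.

```python
import re

# Cys-X-X-X-Cys: a cysteine, any three residues, then another cysteine.
# The lookahead makes each match zero-width, so overlapping motifs
# (for example C-X-X-X-C-X-X-X-C) are each reported.
CXXXC = re.compile(r"(?=(C.{3}C))")


def find_cxxxc(seq):
    """Return (1-based start position, 5-residue match) for each motif."""
    return [(m.start() + 1, m.group(1)) for m in CXXXC.finditer(seq.upper())]


# Hypothetical toy sequence for illustration only (NOT the cocoa protein).
toy = "MEQCRRQCEEECQQ"
print(find_cxxxc(toy))  # [(4, 'CRRQC'), (8, 'CEEEC')]
```

Positions are reported 1-based, following the usual convention for numbering residues in a protein sequence.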


Abbreviations

A: absorbance
cDNA: copy DNA
IgG: immunoglobulin G
kb: kilobase pairs
kDa: kilodaltons
Mr: relative molecular mass
SDS-PAGE: sodium dodecyl sulphate-polyacrylamide gel electrophoresis



Additional information

The authors are very grateful to Dr R. Jennings of the Virology Department, Sheffield University Medical School, for help in raising antibodies.


About this article

Cite this article

Spencer, M.E., Hodge, R. Cloning and sequencing of a cDNA encoding the major storage proteins of Theobroma cacao. Planta 186, 567–576 (1992). https://doi.org/10.1007/BF00198037


  • DOI: https://doi.org/10.1007/BF00198037
