Abstract
Dinucleotide composition has been recognized as a species-specific characteristic of organisms for more than 20 years. Lang (2000, Bioinformatics, 16, 212–221) found that in Monilinia rRNA a species-specific identity is conserved when dinucleotide counts are compressed into net dinucleotide counts (e.g., 50 AC and 20 CA yield 30 nAC) and then into clusters of net dinucleotides of equal value, called circuits (e.g., 30 nAC + 30 nCT + 30 nTA = 30 ACTA). This study evaluates circuit assemblages (CAs), the collection of all net dinucleotide circuits derived from a sequence, in a diverse set of 110 HIV-1 genomes. The circuit composition, which is often based on ≤ 15% of the total dinucleotides of a sequence, uniquely characterizes each gene and genome, even though the pairwise similarity of the sequences is as low as 70%. Variations in net dinucleotide distributions are associated with structural and functional features of the genome and its proteins. Circuit values of the env signal sequence differ between subtypes that have remained localized and those that have become pandemic. CAs of complete HIV-1 genomes are similar to those of other retro-transcribing viruses, and distinct from those of viroids and of single- and double-stranded DNA and RNA viruses. CAs provide a succinct, quantitative, and species-specific description of DNA composition that is consistent with the results of traditional analytic methods at multiple levels of genome organization.
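The net dinucleotide arithmetic in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names are hypothetical, overlapping dinucleotide counting is an assumption, homodinucleotides (AA, CC, GG, TT) are simply skipped, and the circuit check only tests whether a proposed closed path has the same net value on every step. It does not attempt the full circuit decomposition described in Lang (2000).

```python
from collections import Counter


def dinucleotide_counts(seq):
    """Count overlapping dinucleotides (assumption: overlapping windows)."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def net_dinucleotide_counts(counts):
    """Reduce each dinucleotide and its reverse (AC vs. CA) to one net count.

    Only the direction with the surplus is kept, e.g. 50 AC and 20 CA
    give a net count of 30 for AC.
    """
    net = {}
    for pair, n in counts.items():
        a, b = pair[0], pair[1]
        if a == b:
            continue  # AA, CC, ... have no distinct reverse; skipped here (assumption)
        diff = n - counts.get(b + a, 0)
        if diff > 0:
            net[pair] = diff
    return net


def equal_value_circuit(net, path):
    """Return the shared net count if every step of a closed path
    (e.g. "ACTA" = AC, CT, TA) has the same positive net value, else None."""
    if len(path) < 3 or path[0] != path[-1]:
        return None
    values = {net.get(path[i:i + 2], 0) for i in range(len(path) - 1)}
    if len(values) == 1:
        value = values.pop()
        return value if value > 0 else None
    return None


# The two examples given in the abstract
example_net = net_dinucleotide_counts(Counter({"AC": 50, "CA": 20}))
example_circuit = equal_value_circuit({"AC": 30, "CT": 30, "TA": 30}, "ACTA")
print(example_net, example_circuit)  # {'AC': 30} 30
```

Under these assumptions, the first call reproduces the 30 nAC example and the second confirms that AC, CT and TA, each with a net value of 30, close into the circuit written as 30 ACTA.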
References
R. Nussinov, Nucleic Acids Res. 12, 1749–1763 (1984)
S. Karlin, Curr. Opin. Microbiol. 1, 598–610 (1998)
A. Campbell, J. Mrazek, S. Karlin, Proc. Natl. Acad. Sci. USA 96, 9184–9189 (1999)
S. Karlin, L. Brocchieri, J. Mrazek, A.M. Campbell, A.M. Spormann, Proc. Natl. Acad. Sci. USA 96, 9190–9195 (1999)
A.J. Gentles, S. Karlin, Genome Res. 11, 540–546 (2001)
K. Jabbari, G. Bernardi, Gene 333, 143–149 (2004)
D.M. Lang, Bioinformatics 16, 212–221 (2000)
H. Nakashima, K. Nishikawa, T. Ooi, DNA Res. 4, 185–192 (1997)
H. Nakashima, M. Ota, K. Nishikawa, T. Ooi, DNA Res. 5, 251–259 (1998)
R. Sandberg, G. Winberg, C. Branden, A. Kaske, I. Ernberg, J. Coster, Genome Res. 11, 1404–1409 (2001)
O.N. Reva, B. Tummler, BMC Bioinformatics 6, 251 (2005)
F. Gao, E. Bailes, D.L. Robertson, Y. Chen, C.M. Rodenburg, S.F. Michael, L.B. Cummins, L.O. Arthur, M. Peeters, G.M. Shaw, P.M. Sharp, B.H. Hahn, Nature 397, 436–441 (1999)
S. Corbet, M.C. Muller-Trutwin, P. Versmisse, S. Delarue, A. Ayouba, J. Lewis, S. Brunak, P. Martin, F. Brun-Vezinet, F. Simon, F. Barre-Sinoussi, P. Mauclere, J. Virol. 74, 529–534 (2000)
J. Mokili, B. Korber, J. Neurovirol. 11(Suppl 1), 66–75 (2005)
P. Lemey, O.G. Pybus, A. Rambaut, A.J. Drummond, D.L. Robertson, P. Roques, M. Worobey, A.M. Vandamme, Genetics 167, 1059–1068 (2004)
M.M. Vanden Haesevelde, M. Peeters, W. Janssens, G.M. Shaw, G. Jannes, G. van der Groen, E. Saman, Virology 221, 346–350 (1996)
http://hiv-web.lanl.gov/components/hiv-db/combined_search_s_tree/search.html
http://hiv-web.lanl.gov/content/hiv-db/ALIGN_03/ALIGN-INDEX.html
B. Korber, B.T. Foley, C. Kuiken, S.K. Pillai, J.G. Sodroski, in Human Retroviruses and AIDS 1998, Theoretical Biology and Biophysics Group ed. by B. Korber, C.L. Kuiken, B. Foley, B. Hahn, F. McCutchan, J.W. Mellors, J. Sodroski (LANL, Los Alamos, NM, 1998) pp. III-102–111
D.V. Faulkner, J. Jurka, Trends Biochem. Sci. 13, 321–322 (1988)
http://www.fon.hum.uva.nl/Service/Statistics/Wilcoxon_Test.html
M.A. Massiah, D. Worthylake, A.M. Christensen, W.I. Sundquist, C.P. Hill, M.F. Summers, Protein Sci. 5, 2391–2398 (1996)
C. Tang, Y. Ndassa, M.F. Summers, Nat. Struct. Biol. 9, 537–543 (2002)
R.K. Gitti, B.M. Lee, J. Walker, M.F. Summers, S. Yoo, W.I. Sundquist, Science 273, 231–235 (1996)
D. Braaten, H. Ansari, J. Luban, J. Virol. 71, 2107–2113 (1997)
H. Javanbakht, R. Halwani, S. Cen, J. Saadatmand, K. Musier-Forsyth, H.G. Gottlinger, L. Kleiman, J. Biol. Chem. 278, 27644–27651 (2003)
A. Land, D. Zonneveld, I. Braakman, FASEB J. 17, 1058–1067 (2003)
M. Dettenhofer, X.F. Yu, J. Biol. Chem. 276, 5985–5991 (2001)
B. Martoglio, R. Graf, B. Dobberstein, EMBO J. 16, 6636–6645 (1997)
Y. Li, L. Luo, D.Y. Thomas, C.Y. Kang, Virology 272, 417–428 (2000)
E.O. Freed, Somat. Cell Mol. Genet. 26, 13–33 (2001)
C. Perales, L. Carrasco, M.E. Gonzalez, Biochim. Biophys. Acta 1743, 169–175 (2005)
T. Rein, H. Zorbas, M.L. DePamphilis, Mol. Cell Biol. 17, 416–426 (1997)
P. Schattner, Nucleic Acids Res. 30, 2076–2082 (2002)
K. Yamagishi, T. Oshima, Y. Masuda, T. Ara, S. Kanaya, H. Mori, DNA Res. 9, 19–24 (2002)
H. Nakashima, S. Fukuchi, K. Nishikawa, J. Biochem. (Tokyo) 133, 507–513 (2003)
Acknowledgments
Dr. John Palfreyman made many helpful suggestions for the manuscript. Doug MacLean provided technical assistance. The support of the University of Abertay-Dundee made the work possible. All are deeply appreciated.
Cite this article
Lang, D.M. Circuit assemblages derived from net dinucleotide values provide a succinct identity for the HIV-1 genome and each of its genes. Virus Genes 36, 11–26 (2008). https://doi.org/10.1007/s11262-007-0128-6
DOI: https://doi.org/10.1007/s11262-007-0128-6