OSTI.GOV · U.S. Department of Energy, Office of Scientific and Technical Information

Title: HIV sequence compendium 2002

Technical Report · DOI: https://doi.org/10.2172/1184349 · OSTI ID: 1184349

This compendium is an annual printed summary of the data contained in the HIV sequence database. In these compendia we try to present a judicious selection of the data in a way that is of maximum utility to HIV researchers. Traditionally, we present the sequence data themselves in the form of alignments: Section II, an alignment of a selection of HIV-1/SIVcpz full-length genomes (many LAI-like sequences, for example, have been omitted because they are so similar that they bias the alignment); Section III, a combined HIV-1/HIV-2/SIV whole genome alignment; Sections IV–VI, amino acid alignments for HIV-1/SIVcpz, HIV-2/SIV, and SIVagm. The HIV-2/SIV and SIVagm amino acid alignments are kept separate because the genetic distances between these groups are so great that a single alignment would become very elongated by the large number of gaps that would have to be inserted. As always, tables with extensive background information gathered from the literature accompany the whole genome alignments. The collection of whole-gene sequences in the database is now large enough that we have abundant representation of most subtypes. For many subtypes, and especially for subtype B, a large number of sequences that span entire genes were left out of the printed alignments to conserve space. A more complete version of all alignments is available on our website, http://hiv-web.lanl.gov/content/hiv-db/ALIGN_CURRENT/ALIGN-INDEX.html. Importantly, all these alignments have been edited to include only one sequence per person, based on phylogenetic trees that were created for all of them, as well as on the literature.
Because of the number of sequences available, we have decided to use a different selection principle this year, based on the epidemiological importance of the subtypes. Subtypes A–D and CRFs 01 and 02 are by far the most widespread variants, and for these (when available) we have included 8–10 representatives in the alignments. The other subtypes and CRFs are of lesser importance, and for each of these we have included 4–5 representatives, or as many as are available. In the alignments we have also included the ‘Circulating Recombinant Forms’ (CRFs), mosaic genomes that have epidemiological significance. See the 1999 review of nomenclature (http://hiv-web.lanl.gov/content/hiv-db/REVIEWS/nomenclature/Nomen.html) for more on CRFs, and see for an overview of the patterns of known CRFs. Amino acid alignment chapters begin with an annotation table that includes sequence names, accession numbers, genomic region represented, author, and references. We have also made an effort to bring the HIV-2/SIV and SIVagm alignments up to date.

Research Organization: Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Organization: USDOE
DOE Contract Number: W-7405-ENG-36
OSTI ID: 1184349
Report Number(s): LA-UR-03-3564
Country of Publication: United States
Language: English
