HIV integration site selection: Analysis by massively parallel pyrosequencing reveals association with epigenetic modifications

  Gary P. Wang1, Angela Ciuffi1, Jeremy Leipzig1, Charles C. Berry2, and Frederic D. Bushman1,3

  1 University of Pennsylvania, School of Medicine, Department of Microbiology, Philadelphia, Pennsylvania 19104-6076, USA;
  2 Department of Family/Preventive Medicine, University of California, San Diego School of Medicine, San Diego, California 92093, USA

Abstract

Integration of retroviral DNA into host cell DNA is a defining feature of retroviral replication. HIV integration is known to be favored in active transcription units, which promotes efficient transcription of the viral genes, but the molecular mechanisms responsible for targeting are not fully clarified. Here we used pyrosequencing to map 40,569 unique sites of HIV integration. Computational prediction of nucleosome positions in target DNA indicated that integration sites are periodically distributed on the nucleosome surface, consistent with favored integration into outward-facing DNA major grooves in chromatin. Analysis of integration site positions in the densely annotated ENCODE regions revealed a wealth of new associations between integration frequency and genomic features. Integration was particularly favored near transcription-associated histone modifications, including H3 acetylation, H4 acetylation, and H3 K4 methylation, but was disfavored in regions rich in transcription-inhibiting modifications, which include H3 K27 trimethylation and DNA CpG methylation. Statistical modeling indicated that effects of histone modification on HIV integration were partially independent of other genomic features influencing integration. The pyrosequencing and bioinformatic methods described here should be useful for investigating many aspects of retroviral DNA integration.

Footnotes

  • 3 Corresponding author. E-mail bushman{at}mail.med.upenn.edu; fax (215) 573-4856.

  • [Supplemental material is available online at www.genome.org. The sequence data from this study have been submitted to GenBank under accession nos. EI522403–EI666579, and the raw data for transcriptional profiling have been deposited in NCBI Gene Expression Omnibus under accession no. GSE7508.]

  • Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.6286907

    • Received January 18, 2007.
    • Accepted April 10, 2007.