Integrated Human Immunodeficiency Virus Type 1 Sequence in J-Lat 10.6

The full length of HIV/R7/E−/GFP integrated in the J-Lat 10.6 cell line was sequenced in this study. The single copy of the integrated virus, including the breakpoints from the human chromosome to the provirus, was amplified by two separate PCRs. A 10,200-bp genome sequence was acquired, analyzed, and deposited in GenBank.

The J-Lat 10.6 cell line is a subclone derived from Jurkat-based cells infected with a pseudotyped human immunodeficiency virus type 1 (HIV-1) (genus Lentivirus, family Retroviridae) strain, HIV/R7/E−/GFP (1, 2). The integrated HIV-1 copy in this cell line is located in the second intron of SEC16A (chromosome 9, position 136468579), providing a useful cell line for studying HIV-1 latency (3). In order to use J-Lat 10.6 for anti-HIV-1 gene editing and design strategies using the clustered regularly interspaced short palindromic repeats (CRISPR) system, it is necessary to have the full proviral DNA sequence (4–11). However, the full-length sequence of integrated HIV/R7/E−/GFP has not been reported. Here, we amplified two overlapping fragments and performed subsequent Sanger sequencing to acquire the whole genome of HIV/R7/E−/GFP.
Genomic DNA was isolated from J-Lat 10.6 cells utilizing the QIAamp DNA minikit (catalog number 51304; Qiagen) as described by the manufacturer. In order to determine the HIV-1 proviral sequence, the DNA was amplified as two fragments using newly designed primers (Table 1). The amplicon starting at the 5′ end of the provirus (5′ amplicon) was 8,999 bp and was amplified using primers based on the reported integration site (3); primers 10.6_up5LTR_F and eGFP-N (ReadyMade Primers, catalog number 51-01-05-05; Integrated DNA Technologies), directed to the N terminus of the gene for enhanced green fluorescent protein (eGFP), which replaces nef in this molecular clone, were used for the 5′ amplicon PCR. An adapted single-genome amplification protocol (12) using Platinum Taq polymerase (catalog number 10966026; Invitrogen) was implemented.
The 3′ amplicon encompassed eGFP and the 3′ long terminal repeat (LTR) and was 2,140 bp. It was amplified using primers Frag-26-R-RC and 10.6_down3LTR_R, with the PCR conditions listed in Table 1. The PCR products were enzymatically purified utilizing ExoSAP-IT PCR product cleanup reagent (catalog number 78201.1.ML; Thermo Fisher Scientific). Sanger sequencing was performed by GENEWIZ, Inc. (South Plainfield, NJ), using Applied Biosystems BigDye version 3.1 and the primers listed in Table 1. The reactions were run on an Applied Biosystems 3730xl DNA analyzer.
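As an illustration of how two overlapping amplicon sequences can be joined into a single contig, the following is a minimal Python sketch. It assumes exact identity across the overlap, which real Sanger reads may not satisfy, and it is not the algorithm used by the assembly software in this study. The function name `merge_overlapping` and the `min_overlap` parameter are hypothetical and chosen only for this example.

```python
def merge_overlapping(left: str, right: str, min_overlap: int = 20) -> str:
    """Join two sequences where a suffix of `left` equals a prefix of `right`.

    Illustrative only: requires an exact match across the overlap and
    returns the merge that uses the longest such overlap.
    """
    longest_possible = min(len(left), len(right))
    # Try the longest candidate overlap first, then shrink.
    for k in range(longest_possible, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    raise ValueError(f"no exact overlap of at least {min_overlap} bases found")


if __name__ == "__main__":
    # Toy sequences, not real amplicon data.
    five_prime = "AAACCCGGGTTT"
    three_prime = "GGGTTTACGT"
    print(merge_overlapping(five_prime, three_prime, min_overlap=4))
```

An exact-match merge is the simplest possible choice; an assembler that tolerates mismatches and weighs base quality would be needed for raw reads.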
Quality control of the sequence was performed by end trimming using average quality scores of >16 over 21 bp, followed by assembly with default settings in DNASTAR SeqMan (13). The entire HIV-1 proviral genome reported was 10,200 bp, with a GC content of 43.7%. Every nucleotide within HIV/R7/E−/GFP was sequenced at least twice for a high level of accuracy and was annotated by DNASTAR SeqBuilder for GenBank submission (13). Previously reported mutations in vpr and env and the resulting immature proteins were annotated in the GenBank file (2, 14). There is an insertion of thymine and adenine at nucleotide position 6548, which causes a frameshift of env and an early stop codon at amino acid residue 85. The vpr coding region has an insertion of thymine at nucleotide position 5919, which causes a frameshift of vpr and an early stop codon at Vpr amino acid residue 79.
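The end-trimming criterion and the GC content calculation can be sketched in Python as follows. This is one plausible reading of "average quality >16 over 21 bp": the read is kept from the start of the first 21-base window whose mean quality exceeds 16 to the end of the last such window. The exact trimming rule applied by the software used in this study may differ, and the names `trim_ends` and `gc_content` are hypothetical.

```python
from typing import List, Tuple


def trim_ends(quals: List[int], window: int = 21,
              threshold: float = 16.0) -> Tuple[int, int]:
    """Return (start, end) of the slice kept after end trimming.

    A window passes when its mean quality is strictly greater than
    `threshold`. Returns (0, 0) when no window passes.
    """
    n = len(quals)
    if n < window:
        return (0, 0)
    passing = [
        i for i in range(n - window + 1)
        if sum(quals[i:i + window]) / window > threshold
    ]
    if not passing:
        return (0, 0)
    # Keep from the first passing window to the end of the last one.
    return (passing[0], passing[-1] + window)


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in `seq` (case-insensitive)."""
    if not seq:
        return 0.0
    s = seq.upper()
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


if __name__ == "__main__":
    # Toy read: low-quality tails around a high-quality core.
    quals = [2] * 30 + [40] * 60 + [2] * 30
    start, end = trim_ends(quals)
    print(start, end)
    print(gc_content("ATGC"))
```

Because the rule averages over a window, a few low-quality bases next to the high-quality core are retained; a stricter per-base cutoff would trim more aggressively.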
Data availability. The accession number for the genome sequence of HIV/R7/E−/GFP and the flanking integration site is MN989412.

ACKNOWLEDGMENTS
This work was funded in part by the Public Health Service, National Institutes of Health, through (i) National Institute of Mental Health (NIMH) grant R01 MH110360