Sequence Evaluation and Comparative Analysis of Novel Assays for Intact Proviral HIV-1 DNA

The intact proviral DNA assay (IPDA) and quadruplex PCR (Q4PCR) represent major advances in accurately quantifying and characterizing the replication-competent HIV reservoir. This study compares the two novel approaches for measuring intact HIV proviral DNA in samples from 39 antiretroviral therapy (ART)-suppressed people living with HIV, thereby informing ongoing efforts to deplete the HIV reservoir in cure-related trials.

quantitative viral outgrowth assays (2). Comparisons of different reservoir measurements provide useful insights into the nature of the HIV reservoir. For example, comparison of reservoir measurements revealed a greater than 2-log discrepancy between the frequency of integrated proviruses measured by DNA PCR and replication-competent proviruses measured by viral outgrowth (3). This discrepancy is largely explained by the disproportionate frequency of defective proviral DNA and variable inducible outgrowth of intact proviruses (4,5). Thus, single-probe HIV DNA PCR-based assays are limited in specificity for intact HIV genomes because a large fraction of the proviruses that they detect are defective.
Although quantitative viral outgrowth assays are specific for intact HIV genomes, they are labor-intensive, and the results vary over time by as much as a factor of 6 (6). Furthermore, they underestimate the frequency of intact HIV genomes because not all proviruses are equally inducible in vitro (5,7). As such, many intact proviruses are not detected in outgrowth assays. To improve on the specificity of PCR-based assays for detection of intact proviruses, novel methods have been developed that combine probes that target relatively conserved regions of the genome (intact proviral DNA assay [IPDA]) (8) in addition to limiting dilution near-full-genome proviral sequencing (a component of Q4PCR) (9).
The IPDA utilizes digital droplet PCR (ddPCR) to measure proviruses with probes targeting conserved regions in the packaging signal (PS) and envelope (env) that when amplified together in the same provirus, exclude the majority of defective proviruses. Input cells are directly measured by a simultaneous ddPCR reaction, enabling the IPDA to report the total frequency of intact and defective proviruses per million input cells. The Q4PCR assay employs long distance PCR at limiting dilution to amplify proviruses, followed by interrogation with four quantitative PCR (qPCR) probes for PS, env, pol, and gag in a multiplex reaction to detect amplified HIV-1 genomes. Cell inputs are estimated by quantification of input DNA, and the frequency of intact proviruses is reported after sequence verification by near-full-length genome sequencing (nFGS). Because the Q4PCR employs limiting dilution near-full-length proviral amplification, these nFGS results can be used to provide insight into the clonal composition of intact proviruses.
Here, we compare quantitative viral outgrowth assays, Q4PCR, IPDA, and total HIV gag DNA measurements on samples from 39 antiretroviral therapy (ART)-suppressed people living with HIV (PLWH) to explore the advantages and limitations of these assays.

RESULTS
Quantitative comparison. Leukapheresis or whole blood samples were obtained from PLWH, stably suppressed (,50 copies of HIV-1 RNA/ml) on ART under protocols approved by the University of North Carolina (UNC) or Rockefeller University biomedical institutional review boards. The majority of study participants were treated during chronic infection and had been on ART for a median of 8.4 years (see Table S1 in the supplemental material). For all 39 participants, Q4PCR and IPDA were performed on DNA extracted from total CD4 (tCD4) T cells from these two cohorts. Total HIV gag DNA was measured in tCD4 T cells using an assay designed to capture group M HIV-1 proviral DNA (10). Finally, we performed quantitative and qualitative viral outgrowth assay (Q 2 VOA) (n = 11, Rockefeller cohort, tCD4 cells) or quantitative viral outgrowth assay (QVOA) (n = 13, UNC cohort, resting CD4 T cells) to measure inducible replication-competent HIV.
As expected, the sum of the IPDA-derived intact, 39-defective, and 59-defective proviral frequencies (IPDA total) yielded the highest proviral frequency estimates (median, 652 copies/million CD4 T cells), and quantitative viral outgrowth assays yielded the lowest across both cohorts (median, 0.60 infectious units/million CD4 T cells) (Fig. 1A). Median proviral frequency using a single-probe long terminal repeat (LTR)-gag assay (gag DNA) was 387 copies/million CD4 T cells. As expected, total HIV DNA frequency  The frequency of intact HIV genomes detected by IPDA and Q4PCR were intermediate between gag DNA and viral outgrowth, consistent with previous studies with these assays and with previous estimates based on limiting dilution proviral nFGS (4,8,9,11). For IPDA, proviruses were considered intact if they were positive for both PS and env probes. For Q4PCR, proviruses were considered intact if they were sequence verified. The median intact proviral frequency per million CD4 T cells was 65 for IPDA and 5 for Q4PCR (Fig. 1A). IPDA measurements were a median of 19 (interquartile range 7 to 48)-fold higher than Q4PCR measurements. Compared to outgrowth assays, the median ratio of gag DNA (779), intact sequences by IPDA (169), or Q4PCR (7) was greater than one (Fig. 1B).
To further understand the relationship between the assays, we calculated the Spearman correlation between gag, IPDA intact, IPDA 39-defective, IPDA 59-defective, infectious units per million (IUPM), and Q4PCR intact ( Fig. 1C and D). IPDA and Q4PCR measures of intact DNA showed similar correlations with quantitative viral outgrowth measures (Spearman r = 0.49 and 0.41, respectively) and with each other (Spearman r = 0.39). IPDA and Q4PCR both moderately correlated with gag DNA (r = 0.61 and r = 0.31, respectively), in agreement with a previous report comparing IPDA and gag measurements (12). The 59-defective and 39-defective proviral frequencies measured using the IPDA also moderately correlated with quantitative viral outgrowth, IPDA intact, and Q4PCR proviral frequencies ( Fig. 1C and D).
IPDA and Q4PCR proviral amplification. We observed amplification signal failure of either the PS or env IPDA primer/probe sets for 7 (18%) of the 39 participants (0 of 12 from the UNC cohort, and 7 of 27 from the Rockefeller cohort). In addition, another small subset of samples (4 of 39; 10.3%) showed reduced fluorescent signal intensities in the PS or env IPDA primer/probe set, thereby complicating the unambiguous identification of a distinct double positive droplet population. With Q4PCR, we were able to identify sequences that were positive for 2 or more probes in 37 of the 39 individuals (9). However, we did not detect intact sequences from 8 of the 39 participants (20.5%).
We determined viral genotypes using the Q4PCR sequencing data. Overall, we identified 3 participants (one from the UNC cohort, two from the Rockefeller cohort) with non-subtype B viral infection (clades A1 and G). For 2 of the 3 non-subtype B-infected individuals, we observed IPDA amplification failures in the env IPDA probe, reflecting the fact that the initial version of the IPDA was designed based on clade B sequence data (8). In contrast, using Q4PCR, we retrieved intact proviral sequences from 2 of the 3 individuals with non-subtype B viruses. The utilization of 4 instead of 2 probes appears to render Q4PCR less sensitive to subtype variability. For example, env amplification failed for participant 5112 in both assays, but several intact proviral genomes scored positive for the PS, pol, and gag probe by Q4PCR (Fig. 2C).
For the fraction of participants harboring clade-B viruses, the frequency of IPDA amplification signal failure was 14.7%, which is similar to but slightly higher than that reported by others in larger cohorts (11) ( Fig. 2A and Table S2). for the IPDA (detectable defective proviruses but no PS1env1 proviruses; see Materials and Methods). Assay data from participants with an amplification failure (7/39 for IPDA, 2/39 for gag) or no recovery of intact proviral sequences for Q4PCR (8/39) are represented as hollow symbols and were excluded from the analysis, but other assay data for those participants were included if available. (B) Frequency ratio with viral outgrowth measures (IUPM) as the denominator for total HIV gag, intact provirus (IPDA), and intact provirus (Q4PCR). Data points that were left-censored, had an IPDA amplification failure, or had no recovered Q4PCR intact proviral sequences were excluded for this analysis. (C) Scatterplots showing correlations of reservoir measurements, including Spearman r and unadjusted P values. Left-censored measurements were included as follows: for Q4PCR, when no intact proviruses were recovered, a value of 0 was used, and for IPDA, a left-censor of 5 copies/million CD4 T cells was used as described in Materials and Methods. Data points were excluded when an IPDA amplification failure was present. (D) Spearman correlation and P value matrices comparing all available reservoir measurements. P values are unadjusted. Left-censored measurements were included as follows: for Q4PCR, when no intact proviruses were recovered, a value of 0 was used, and for IPDA, a left-censor of 5 copies/million CD4 T cells was used as described in Materials and Methods. Data points were excluded when an IPDA amplification failure was present. PS and env sequence conservation among intact proviruses. Overall, we found 506 intact and 4,211 defective proviral sequences by Q4PCR (Table 1). The IPDA and Q4PCR utilize overlapping primer/probe sets in PS and identical primer/probe in env (9). To determine the level of sequence conservation in the primer/probe binding regions in our study participants at a single nucleotide resolution we aligned all 506 intact proviral genomes from Q4PCR nFGS.
As expected, the binding regions of the IPDA/Q4PCR PS and env primer/probe sets showed high levels of conservation among the intact proviral genomes (8,9). Nevertheless, single nucleotide polymorphisms (SNP) could be detected to various degrees in all primers and probes of both PS and env. While the occurrence of common SNPs in the 59 region of primers/probes (e.g., up to 50% in the 59 PS probe region) are expected to have little impact on signal quality, a mismatch in the very 39 end of the primers/probes (e.g., 6% SNP frequency in the IPDA PS forward primer 39 end) can impact signal detection considerably (13,14) (Fig. 2B).
PS and env sequence polymorphism analysis. To examine the importance of HIV diversity on an individual level, we next used the Q4PCR sequence information to study polymorphisms and associated signal patterns for individual participants.
In most cases, the quality of IPDA and Q4PCR signal could be explained by the absence or presence of sequence polymorphisms in the primer/probe binding regions of proviral genomes. For example, intact proviruses from participant UNC-434 showed complete conservation in the primer/probe binding regions, resulting in clear and distinct IPDA and Q4PCR PS and env signals (Fig. 3A). In contrast, the amplification failure of the IPDA PS signal in participant 9243 could be explained by a two-nucleotide mismatch at the 39 end of the PS IPDA forward primer (Fig. 3B). In addition, single nucleotide polymorphisms in the env primer/probe region of intact proviruses from participant 5101 (Fig. 3C) resulted in signal reduction rather than the complete amplification failure for the IPDA. While such a decrease in droplet separation complicates the clear identification of PS1env-positive proviruses by IPDA, PCR annealing temperature decreases were shown to improve droplet resolution (Fig. 4A). Proviral signal patterns (droplet amplitude or qPCR amplification curves) were found to be stable over time in all six individuals that were assayed at more than one time point, thereby demonstrating the consistent impact of sequence polymorphisms on signal patterns. However, occasional discordant results between qPCR amplification and sequencing results require further study.
PS and env intact proviral genome prediction. Intact proviral DNA detection assays face the challenges of both detecting viral sequences and correctly characterizing them as intact or defective. Using nFGS data, Bruner and colleagues reported that the IPDA PS and env primer/probe sets exclude 97% of proviruses with defects detectable by nFGS and that approximately 70% of proviruses classified as intact by IPDA lack defects detectable by nFGS (8).
We used Q4PCR sequencing data to assess the fraction of true intact sequences out of all samples that were amplified by the Q4PCR PS and env primer/probe set combination. We considered a total of 4,717 proviruses collected from all wells positive for 2 or more of the quadruplex qPCRs of the Q4PCR. Notably, proviral sequence was not collected from wells positive for only one qPCR in the Q4PCR, which likely contain defective proviruses (9). A total of 29 PS1env-positive sequences with a threshold cycle (C T ) of .38 were excluded. Out of the remaining 4,688 proviral genomes, only 625, or 13.3%, scored positive with PS1env (C T , ,38 for both amplicons). Out of 4,139 defective proviruses positive for 2 or more of the quadruplex qPCRs in Q4PCR, only 305 (7.4%) were found to be positive for the combination of PS1env, thereby demonstrating the exceptional specificity of this particular probe combination to exclude defective proviruses (9). At the same time, the Q4PCR PS1env combination failed to detect 175 (35.3%) out of 475 intact proviruses. The majority of the 175 PS1env-negative intact genomes were recovered from individuals with amplification signal failures of either the PS or env primer/probe set, and only the utilization of two additional probes (gag and pol) enabled the detection and intact sequence verification by Q4PCR (Fig. 4A). Signal failure was generally consistent between Q4PCR and IPDA, except where there were signal-eliminating polymorphisms in the Q4PCR but not  (Continued on next page) the IPDA PS primer/probe set or vice-versa (Fig. 3B). Importantly, amplicon signal failure in the IPDA is readily apparent, and intact provirus values are not reported in these cases (11). Of all 625 PS1env-positive samples, 320, or only 51.2%, were actually identified as intact proviral genomes on a sequence level. To assess the percentage of PS1env-positive samples that were truly intact by nFGS on an individual participant basis, we analyzed individuals with at least 5 PS1env-positive sequences (Table 2). Importantly, and in line with our previous study, the percentage of PS1env-positive proviruses that were truly intact by nFGS appeared to vary between individuals (Fig. 4B) (9). PS1envpositive proviruses are scored as intact by IPDA. However, we found that the frequencies of PS1env-positive proviruses detected by Q4PCR were a median of 6-fold (interquartile range 2 to 17) lower than IPDA measurements (Table S2). Further, after sequence verification, Q4PCR values for intact viruses were substantially lower (by a median of 19-fold) than IPDA. Both of these results suggest that limitations of long-distance PCR amplification in Q4PCR as well as the IPDA misidentification of some PS and env amplifiable sequences as intact contribute to differences in the quantification of the intact proviral reservoir (Table 3).

DISCUSSION
We compared the results of some of the available HIV-1 reservoir assays, including Q4PCR and IPDA, because quantitation and characterization of the HIV reservoir informs efforts to develop an HIV cure (1). The great majority of integrated HIV proviral DNA is defective. Outgrowth assays quantitate the replication-competent reservoir, but they are labor-intensive and not scalable to large clinical studies. In addition, they do not capture all replication-competent viruses. The IPDA and Q4PCR were designed to address some of these issues. The IPDA utilizes ddPCR to measure proviruses that are PS1env-positive, which together exclude the great majority of defective proviruses with defects detectable by nFGS. This assay enables much more specific quantification of intact proviral DNA than single-probe assays. However, the PS and env probe combination is not entirely predictive of a fully intact provirus (8,9). Q4PCR improves the sensitivity and specificity of PCR-based assays for intact HIV DNA using a combination of qPCR and sequencing (9). The Q4PCR assay employs four different qPCR probes for PS, env, pol, and gag in a multiplex reaction to detect the HIV-1 genome in limiting dilution samples. Samples positive for 2 or more probes are subjected to near-full-length sequencing to verify the intactness of the provirus, enabling a very high specificity for intact proviruses. However, this assay is not scalable for a large number of samples precisely because it requires limiting dilution PCR and full-length proviral DNA sequencing.
Our results show that Q4PCR and IPDA are positively correlated but differ in that IPDA identifies a higher number of intact proviruses than Q4PCR. Several possibilities, which are not mutually exclusive, may explain why intact proviral frequency estimates from Q4PCR were a median of 19-fold lower than those from IPDA (9).
First, limitations of long-distance PCR amplification in Q4PCR may have a positive bias toward amplification and subsequent quantification of defective proviruses due to their shorter length from large deletions (15,16). Inefficient amplification may account for the high degree of variation between the number of estimated intact genomes measured by different nFGS assays (4, 5, 9, 17-21), and it may explain why intact proviruses are not detected by Q4PCR in approximately 20% of infected Q4PCR sequencing data were used to assess the fraction of true intact sequences out of all samples that were amplified by the Q4PCR PS and env primer/probe set combination on an individual participant basis. The graph shows the percentage of proviruses detected by the Q4PCR PS1env primer/probe combination that were truly intact by nFGS for 20 individuals with at least 5 PS1envpositive sequences. The predictive value for intact proviruses is colored in green (PS) and red (env). The defective fraction is shown in gray. The right two bar graphs depict the mean percentage of PS1env-positive samples that were truly intact by nFGS in a pooled analysis of all sequences (first bar graph to the right) or averaged across all 20 participants (second bar graph to the right), respectively.
individuals. In addition, even after correction for defects missed by the IPDA amplicons, intact provirus values measured by IPDA are substantially higher that those obtained by Q4PCR (Table S2). A second factor that may contribute to higher intact proviral measurements in the IPDA versus Q4PCR is cell normalization methods. The IPDA normalizes proviral frequencies to cell input by quantifying host genome input using a parallel ddPCR measurement of the diploid host gene RPP30, while Q4PCR normalizes intact proviral numbers to the cell input based on DNA equivalents after CD4 isolation and DNA extraction, as measured by DNA fluorometry (Qubit).
The third independent factor that may contribute to higher intact proviral estimates from the IPDA may be misidentification of some amplifiable sequences as intact by the PS and env probe combination. Sequencing results from Q4PCR provides some insights into assay performance of the PS1env primer/probe combination. In this large sequencing study (506 intact and 4,211 defective proviral sequences recovered), we found that the Q4PCR PS and env probe combination excluded 93% of defective proviruses in this data set, consistent with the original report (8). With regard to the  prediction of intact proviruses by the PS and env primer/probe combination, we observed that approximately 51% of the proviruses identified by the Q4PCR PS and env primer/probe combination are truly intact by nFGS. This is somewhat less than the 70% estimate from Bruner and colleagues based on analysis of 431 near-full-genome HIV-1 sequences (2.4%, or 10, of which were intact) obtained by single genome analysis from 28 HIV-1-infected adults (8). As observed for Q4PCR, it is also conceivable that the precision of the IPDA PS and env probes to identify intact proviruses varies across individuals and should be considered in the interpretation of IPDA results. However, sequencing of proviruses in IPDA PS1env double positive droplets is needed to confirm this potential limitation (22), as our assessment of the frequency of PS1env-positive proviruses that are truly intact may be limited by the efficiency of nFGS. Despite a variable ability to predict intact proviral genomes, the IPDA PS and env probe combination appears to measure the decay of intact proviruses, as evidenced by the differential decay of intact versus defective proviruses (15,23). Therefore, the IPDA is a useful high-throughput assay to establish an upper limit on the frequency of provirus that may be replication-competent. Taken together, the IPDA affords a higher throughput capacity, but the combination of PS and env probes can both misclassify defective sequences and fail to detect intact ones due to sequence polymorphisms. In contrast, the sequencing information in Q4PCR enables a very high specificity for intact proviruses, but it is not a highthroughput assay, and Q4PCR quantitation of sequence-confirmed intact proviruses may be limited by the inefficiency of nFGS. Both assays are subject to probe amplification failures and HIV-1 subtype variation (11,24), though Q4PCR may be less sensitive to this because of the use of 4 probes and subsequent sequencing of positive wells. Notably, both assays require minimal cell input relative to classical QVOA, which is a clear advantage for clinical trials.
In conclusion, the differences across reservoir assays highlight that the predominantly defective and highly polymorphic proviral landscape makes the measurement of virus that is likely to cause rebound upon ART cessation extremely challenging. Nonetheless, the IPDA and Q4PCR both represent major advances in accurately quantifying and characterizing the replication-competent HIV reservoir. When used together, the two assays provide a deeper understanding of the reservoir than either assay alone and bracket true reservoir size.
Intact proviral DNA assay. Detailed parameters and primer/probe sequences for the intact proviral DNA assay are as previously described (8,25). Briefly, DNA was extracted from total CD4 T cells using the Qiagen QiaAmp DNA extraction kit as described, including the RNase A step (8). DNA was measured on the NanoDrop 1000 spectrophotometer (Thermo Fisher Scientific). Then, 12 to 16 replicate wells of up to 800 ng DNA each were combined with 2X ddPCR supermix for probes (no dUTP; Bio-Rad) and primer/ probe sets for regions in PS and env that favor amplification of intact proviral genomes, as described (3). Final concentrations of 750 nM each primer and 250 nM each probe (6-carboxyfluorescein [FAM]/minor groove binder [MGB] for packaging signal probe; 29-chloro-79phenyl-1,4-dichloro-6-carboxyfluorescein [VIC]/MGB for intact env probe; unlabeled/MGB for hypermutated exclusionary env probe) were used (Thermo Fisher Scientific); all probes contained a 39 nonfluorescent quencher. Droplets were generated on the automated droplet generator (Bio-Rad) and subsequently underwent thermocycling on the C1000 Touch thermal cycler with a 96-deep well reaction module (Bio-Rad catalog no. 1851197). Thermal cycling conditions with a 2°C ramp rate were 10 min at 95°C followed by 45 cycles (30 s at 94°C and 60 s at 59°C per cycle) and 10 min at 98°C, followed by incubation at 12°C as described (6). Droplets were read on the QX200 droplet reader (Bio-Rad catalog no. 1864003) using QuantaSoft software version 1.7.4.0917, with the exception of two samples, one (5114) with a confirmed PS polymorphism (Table S2), which made gating difficult, that were read on the QX100 droplet reader. Replicate wells were merged and analyzed in the QuantaSoft Analysis Pro software.
As described by Bruner and colleagues, a correction for DNA shearing was applied based on a separate, duplicate measurement of two regions of the diploid host gene RPP30 (4). RPP30 amplicons were spaced to match the distance between the PS and env IPDA HIV amplicons in order to quantify and correct for DNA shearing that occurs between the PS and env amplicons during the DNA extraction. DNA shearing indices were highly consistent with other reports of IPDA measurements (median, 0.35; interquartile range [IQR], 0.33 to 0.37) (Table S2) (8,11). The RPP30 reaction was also used to normalize for cell input as described; for each individual, a median of 1.01 Â 10 6 (IQR, 8.35 Â 10 5 to 1.16 Â 10 6 ) CD4 T cell genome equivalents (2 copies of the RPP30 gene) were assayed (8).
Buffer-only no-template controls (NTC) as well as DNA from HIV-seronegative donors were used as a guide to set thresholds for positive droplets on each digital PCR plate. In our hands, we observed occasional low-amplitude false-positive droplets in NTC wells at a rate less than or equal to 5 copies/million CD4 T cells, depending on the run. Based on the observed false-positive rate and the subsampling constraints of digital PCR, IPDA proviral frequencies were censored at 5 copies/million CD4 T cells. In some individuals, a proviral polymorphism in primer/probe binding regions resulted in a droplet spread phenotype that was difficult to set thresholds for because of insufficient droplet separation from the negative population (Fig. 4A, Table S2). These 4 individuals were included in subsequent analyses (Table S2). In some individuals, a proviral polymorphism precluded PCR amplification of one of the IPDA amplicons (PS or env) (Table S2). These individuals' IPDA data were excluded from the analyses.
gag HIV DNA assay. DNA was extracted from total CD4 T cells using the Qiagen QiaAmp DNA extraction kit. Up to 8 replicate wells of up to 500 ng of DNA was added to ddPCR supermix for probes (no dUTP) (Bio-Rad) with a primer/probe set overlapping the LTR-gag junction designed to measure group-M HIV-1 proviral load (10); 900 nM each primer and 250 nM probe were used. Droplets were generated on the automated droplet generator (Bio-Rad) and subsequently underwent thermocycling with a 2°C ramp rate for 10 min at 95°C for 45 cycles (30 s at 94°C and 60 s at 59°C per cycle) and 10 min at 98°C prior to reading on the QX200 droplet reader. RPP30 reactions were used to normalize for cell input. In 2 individuals, no positive droplets were identified. These were considered PCR amplification failures and excluded from further analysis because positive droplets were identified using the partially overlapping IPDA PS primer/probe set in the same 2 individuals.
Q4PCR. Q4PCR was performed as previously described (9). Briefly, genomic DNA from 1 million to 5 million total CD4 1 T cells was isolated using the Gentra Puregene cell kit (Qiagen) or phenol-chloroform, and the DNA concentration was measured using a Qubit high-sensitivity kit (Thermo Fisher Scientific). Next, an outer PCR (NFL1) was performed on genomic DNA at a single-copy dilution (previously determined by gag limiting dilution) using outer PCR primers BLOuterF (59-AAATCTCTA GCAGTGGCGCCCGAACAG-399) and BLOuterR (59-TGAGGGATCTCT AGTTACCAGAGTC-399 [5]). Undiluted 1-ml aliquots of the NFL1 PCR product were subjected to a Q4PCR reaction using a combination of four primer/probe sets that target conserved regions in the HIV-1 genome. Each primer/probe set consists of a forward and reverse primer pair as well as a fluorescently labeled internal hydrolysis probe as previously described (9)  . Each Q4PCR reaction was performed in a 10-ml total reaction volume containing 5 ml TaqMan universal PCR master mix containing Rox (catalog no. 4304437; Applied Biosystems), 1 ml diluted genomic DNA, nuclease-free water, and the following primer and probe concentrations: PS, 675 nM forward and reverse primers with 187.5 nM PS internal probe; env, 90 nM forward and reverse primers with 25 nM env internal probe; gag, 337.5 nM forward and reverse primers with 93.75 nM gag internal probe; and pol, 675 nM forward and reverse primers with 187.5 nM pol internal probe. qPCR conditions were 94°C for 10 min, 40 cycles of 94°C for 15 s, and 60°C for 60 s. All qPCRs were performed in a 384-well plate format using the Applied Biosystem QuantStudio 6 Flex realtime PCR system. qPCR data analysis was performed as previously described (9). Generally, samples showing reactivity with two or more of the four qPCR probes were selected for a nested PCR (NFL2). The NFL2 reaction was performed on undiluted 1-ml aliquots of the NFL1 PCR product. Reactions were performed in a 20-ml reaction volume using Platinum Taq high-fidelity polymerase (Thermo Fisher Scientific) and PCR primers 275F (59-ACAGGGACCTGAAAGCGAAAG-399) and 280R (59-CTAGTTACC AGAGTCACACAACAGACG-399 [5]) at a concentration of 800 nM. Library preparation and sequencing were performed as previously described (9).
As previously described, HIV-1 sequence assembly was performed by our in-house pipeline (Defective and Intact HIV Genome Assembler), which is capable of reconstructing thousands of HIV genomes within hours via the assembly of raw sequencing reads into annotated HIV genomes (26). The steps executed by the pipeline are described briefly as follows. First, we removed PCR amplification and performed error correction using clumpify.sh from the BBtools package v38.72 (https://sourceforge.net/projects/bbmap/). A quality control check was performed with the Trim Galore package v0.6.4 (https://github.com/FelixKrueger/ TrimGalore) to trim Illumina adapters and low-quality bases. We also used bbduk.sh from the BBtools package to remove possible contaminant reads using HIV genome sequences, obtained from the Los Alamos HIV database, as a positive control. We used a k-mer-based assembler, SPAdes v3.13.1, to reconstruct the HIV-1 sequences. The longest assembled contig was aligned via BLAST to a database of HIV genome sequences, obtained from Los Alamos, to set the correct orientation. Finally, the HIV genome sequence was annotated by aligning it against Hxb2 using BLAST. Sequences with double peaks, i.e., regions indicating the presence of two or more viruses in the sample (cutoff consensus identity for any residue, ,70%), or samples with a limited number of reads (empty wells, #500 sequencing reads) were omitted from downstream analyses. In the end, sequences were classified as intact or defective (26).
Quantitative viral outgrowth assays. The quantitative and qualitative viral outgrowth assay (Q2VOA) was performed as previously described (27,28). In brief, isolated CD4 1 T cells were activated with phytohemagglutinin and interleukin 2 (IL-2) (Peprotech) and cocultured with 1 Â 10 6 irradiated PBMCs from a healthy donor in 24-well plates. After 24 h, phytohemagglutinin (PHA) was removed, and 0.1 Â 10 6 MOLT-4/CCR5 cells were added to each well. Cultures were maintained for 2 weeks, splitting the cells 7 days after the initiation of the culture and every other day after that. Positive wells were detected by measuring p24 by enzyme-linked immunosorbent assay (ELISA). The frequency of latently infected cells was calculated through the IUPM algorithm developed by the Siliciano laboratory (http:// silicianolab.johnshopkins.edu) as previously described (27).
The quantitative viral outgrowth assay (QVOA) was performed on resting CD4 T cells for participants from the UNC cohort. Lymphocytes were obtained by continuous-flow leukapheresis and isolation of resting CD4 1 T cells. Recovery and quantification of replication-competent virus was performed as described elsewhere (29). In general, approximately 50 million resting CD4 1 T cells were plated in replicate-limiting dilutions of 2.5 million (18 cultures), 0.5 million (6 cultures), or 0.1 million (6 cultures) cells per well, activated with phytohemagglutinin (Remel), a 5-fold excess of allogeneic irradiated PBMCs from a seronegative donor, and 60 U/ml interleukin 2 for 24 h. Cultures were washed and cocultivated with CD8-depleted PBMCs collected from selected HIV-seronegative donors screened for adequate CCR5 expression. Culture supernatants were harvested and assayed for virus production by p24 antigen-capture enzyme-linked immunosorbent assay (ABL). Cultures were scored positive if p24 was detected at day 15 and was increased in concentration at day 19. The number of resting CD4 1 T cells in infected units per billion was estimated using a maximum likelihood method (30).
Data availability. Proviral sequences have been deposited in GenBank with the accession no. MN090188 to MN090943, MT189273 to MT191207, and MW059111 to MW063110.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only. SUPPLEMENTAL FILE 1, PDF file, 0.1 MB.