Sexually-Transmitted/Founder HIV-1 Cannot Be Directly Predicted from Plasma or PBMC-Derived Viral Quasispecies in the Transmitting Partner

Objective Characterization of HIV-1 sequences in newly infected individuals is important for elucidating the mechanisms of viral sexual transmission. We report the identification of transmitted/founder viruses in eight pairs of HIV-1 sexually-infected patients enrolled at the time of primary infection (“recipients”) and their transmitting partners (“donors”). Methods Using a single genome-amplification approach, we compared quasispecies in donors and recipients on the basis of 316 and 376 C2V5 env sequences amplified from plasma viral RNA and PBMC-associated DNA, respectively. Results Both DNA and RNA sequences indicated very homogeneous viral populations in all recipients, suggesting transmission of a single variant, even in cases of recent sexually transmitted infections (STIs) in donors (n = 2) or recipients (n = 3). In all pairs, the transmitted/founder virus was derived from an infrequent variant population within the blood of the donor. The donor variant sequences most closely related to the recipient sequences were found in plasma samples in 3/8 cases and/or in PBMC samples in 6/8 cases. Although donors were exclusively (n = 4) or predominantly (n = 4) infected by CCR5-tropic (R5) strains, two recipients were infected with highly homogeneous CXCR4/dual-mixed-tropic (X4/DM) viral populations, identified in both DNA and RNA. The proportion of X4/DM quasispecies in donors was higher in cases of X4/DM than R5 HIV transmission (16.7–22.0% versus 0–2.6%), suggesting that X4/DM transmission may be associated with a threshold population of X4/DM circulating quasispecies in donors. Conclusions These results suggest that a severe genetic bottleneck occurs during subtype B HIV-1 heterosexual and homosexual transmission. Sexually-transmitted/founder virus cannot be directly predicted by analysis of the donor’s quasispecies in plasma and/or PBMC.
Additional studies are required to fully understand the traits that confer the capacity to transmit and establish infection, and determine the role of concomitant STIs in mitigating the genetic bottleneck in mucosal HIV transmission.


Introduction
The transmission of human immunodeficiency virus type 1 (HIV-1) and the establishment of a productive infection are complex biological processes, and the details of the mechanisms remain to be elucidated. Initial studies of sexually acquired HIV-1 infection suggested that viral populations in the acute phase are generally highly homogeneous genetically, in contrast to the more heterogeneous viral populations found in chronic infections [1][2][3][4][5][6]. These findings thus suggested that HIV-1 infection is associated with a transmission “bottleneck”. However, more recent studies have reported heterogeneous virus populations shortly after infection in African female sex workers [7][8][9][10] and in American men who have sex with men (MSM) [11]. These observations suggest that the routes and circumstances of infection may affect the complexity of the transmitted virus [12].
The differing findings concerning the complexity of viruses during the acute and early phases of HIV-1 infection probably result from a combination of factors, including differences in the experimental designs and the methodologies used. A common approach has been to identify subjects within the first months following infection and to derive HIV sequences by bulk or near-limiting-dilution PCR amplification of proviral DNA or plasma RNA followed by cloning, sequencing and phylogenetic analyses [1,2,7,10,12]. In 2008, Keele et al. devised a novel strategy for a more precise molecular identification and enumeration of transmitted HIV-1 genomes [13]. This method, single-genome amplification (SGA) followed by direct amplicon sequencing, was recently applied to clinical cohorts of acutely infected individuals [13][14][15][16][17], and the findings indicated that approximately 80% of heterosexual subjects and 60% of MSM are productively infected by a single viral genome. Since most of these studies only characterized the transmitted viral population in recipients, little information was available about its relationship with virus circulating in the donor. Also, few of these studies compared the viral populations identified by analysis of both RNA and DNA samples from donor/recipient pairs.
We report a study of eight transmission pairs, each of them including a sexually-infected patient enrolled into the French ANRS PRIMO cohort at the time of primary HIV-1 infection (PHI) (“recipient”) and his/her HIV-1-infected sexually-transmitting partner (“donor”). SGA-direct amplicon sequencing in plasma RNA and PBMC-derived DNA samples was used to compare C2V5 env gene sequences of the quasispecies in the donors and recipients.

Ethics Statement
The Ethics Committee of Cochin Hospital approved the study, and all the patients gave their written informed consent.

The ANRS PRIMO Cohort
The patients were defined as having PHI from a western blot (WB) profile compatible with ongoing seroconversion (incomplete WB with absence of antibodies to pol proteins) (94% of the patients), detectable plasma HIV RNA with a negative or weakly reactive ELISA (2%), or an interval of less than 6 months between a negative and a positive ELISA result (4%) [18]. The date of infection was estimated as the date of symptom onset minus 15 days or, in asymptomatic patients, the date of incomplete WB minus 1 month, or the midpoint between a negative and a positive ELISA result. Patients were enrolled if HIV infection was estimated to have occurred less than 3 months previously. All patients were antiretroviral (ART)-naïve when enrolled into the cohort. At enrolment, blood samples were collected for immunological and virological studies. Participants completed standardized questionnaires describing HIV-acquisition risk group and sexual behavior (including number and characteristics of sexual intercourses before diagnosis of PHI and history of sexually transmitted infections (STI)). Serological screening for syphilis (Treponema pallidum hemagglutination assay (TPHA) and Venereal Diseases Research Laboratory test (VDRL)), hepatitis B (HBV) and hepatitis C viruses (HCV) was performed at enrolment. In cases of positive HBV or HCV screening results, HBV DNA and HCV RNA were quantified using the COBAS® AmpliPrep/COBAS® TaqMan® HBV v.2.0 assay (Roche, Meylan, France) and the Abbott RealTime HCV® assay (Abbott, Rungis, France), respectively.
This study involved patients enrolled during PHI in the PRIMO cohort who were able to identify the partner likely to be the source of their HIV infection and who agreed to participate in the substudy. They were then asked to invite their partner to complete a questionnaire and provide a blood sample in the month following enrolment of the recipient. Samples from donors were collected and systematically screened for syphilis, HBV and HCV at the time of collection. Between March 1998 and October 2008, 17 donor/recipient pairs were enrolled in our cohort. We herein study the 8 pairs for which plasma and PBMC samples from donors were available.
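The dating rule described above can be sketched in a few lines of Python. This is only an illustration: the function and parameter names are hypothetical, "1 month" is approximated as 30 days, and the order in which the three criteria are tried (symptoms first, then incomplete WB, then the ELISA midpoint) is an assumption drawn from the wording of the rule, not a documented cohort procedure.

```python
from datetime import date, timedelta


def estimate_infection_date(symptom_onset=None, incomplete_wb=None,
                            last_negative_elisa=None, first_positive_elisa=None):
    """Estimate the date of infection from the information available.

    All argument names are hypothetical; the precedence is an assumption.
    """
    if symptom_onset is not None:
        # Symptomatic patients: symptom onset minus 15 days.
        return symptom_onset - timedelta(days=15)
    if incomplete_wb is not None:
        # Asymptomatic patients: incomplete WB minus 1 month (taken as 30 days).
        return incomplete_wb - timedelta(days=30)
    if last_negative_elisa is not None and first_positive_elisa is not None:
        # Midpoint between the negative and the positive ELISA results.
        return last_negative_elisa + (first_positive_elisa - last_negative_elisa) / 2
    raise ValueError("not enough information to estimate the infection date")


print(estimate_infection_date(symptom_onset=date(2005, 3, 20)))
```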

Laboratory Methods
HIV-RNA was quantified with the Cobas Taqman HIV-1 v1.5 assay (Roche Diagnostics, Meylan, France) as recommended by the manufacturer (threshold of detection of 20 copies/ml). Cell-associated HIV-1 DNA in whole blood samples was quantified using the real-time HIV-1 DNA assay (Biocentric, Bandol, France) with a detection limit of 5 copies/PCR. Drug resistance was evaluated by amplifying and sequencing the HIV-1 reverse transcriptase (RT) and protease genes in plasma HIV-RNA samples obtained at enrolment [19]. Resistance to nucleoside RT inhibitors, non-nucleoside RT inhibitors and protease inhibitors was defined according to the 2012 ANRS HIV-1 genotypic resistance interpretation algorithm (www.hivfrenchresistance.org).
HIV-1 subtype was determined by phylogenetic analysis of the RT and V3 env sequences, based on sequence comparisons with previously reported representatives of group M including the reference sequences of subtypes and sub-subtypes, and all the CRF sequences available in the HIV database or GenBank (up to CRF54_01B) (http://www.hiv.lanl.gov).

Viral RNA Extraction and cDNA Synthesis
Viral RNA was extracted from each sample using an in-house protocol: the patient plasma sample (500 µl) was centrifuged at 16,000 rpm for 1 hour at +4°C. The supernatant was removed, and 600 µl of HCV LYS v2.0 (from the Amplicor® HCV Specimen Preparation Kit, Roche Diagnostics, Mannheim, Germany) was added, followed by incubation for 10 minutes at room temperature. A 600-µl volume of isopropanol was added and RNA was precipitated by centrifugation at 14,300 rpm for 15 minutes at +20°C. The supernatant was removed and the RNA pellet was rinsed with 1,000 µl of 70% ethanol. The tube was centrifuged at 14,300 rpm for 5 minutes at +20°C. The supernatant was removed and the RNA pellet was resuspended in 30 µl of water. RNA was immediately subjected to first strand cDNA synthesis using the Titan One Tube RT-PCR System® and the Protector RNase Inhibitor® (Roche Diagnostics, Mannheim, Germany) according to the manufacturer's instructions. Each first strand synthesis reaction included 0.4 µM of the antisense primer ED12 (see sequence below). In some experiments, a different antisense primer, Env8, was used (see sequence below). The reactions were incubated at 50°C for 30 minutes.

Proviral DNA Extraction
The QIAamp DNA Mini Kit® (Qiagen SA, Courtaboeuf, France) was used according to the manufacturer's instructions to extract total DNA from PBMCs from samples collected at inclusion.

DNA Sequencing
Nucleotide sequences were determined by direct sequencing according to the manufacturer's instructions (Applied Biosystems, Foster City, CA, USA). Electrophoresis and data collection were performed on an ABI 3130 Genetic Analyser sequencer (Applied Biosystems). Individual sequence fragments for each amplicon were assembled and edited using Sequence Navigator software [23]. All chromatograms were inspected for sites of mixed bases (double peaks), which would be evidence of priming from more than one template or the introduction of errors in early cycles of PCR. Any sequence with evidence of double peaks was excluded from the subsequent analyses.

Sequence Alignments
Clustal X was used for sequence alignments [24]. Phylogenetic interrelationships among viral sequences were estimated using Neighbor-Joining trees [25], and maximum likelihood methods with BioEdit and MEGA4 integrated molecular evolutionary genetic analysis software [26,27]. The reliability of the tree topology was estimated from 1,000 bootstrap replicates.

Hypermutated Samples
Enrichment for APOBEC3G/F mutations violates the assumption of a constant mutation rate across positions, as the editing performed by these enzymes is base- and context-sensitive. Enrichment for mutations with APOBEC3G/F signatures was assessed using Hypermut 2.0 (www.hiv.lanl.gov). Sequences that yielded a p-value of 0.05 or lower were considered significantly hypermutated and excluded from subsequent analyses.
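As an illustration of the exclusion criterion, the sketch below computes a one-sided Fisher exact p-value from a 2×2 table of G-to-A changes and applies the 0.05 cut-off. It assumes the four counts (mutated and unmutated sites in APOBEC3G/F contexts and in control contexts) have already been tallied; the context matching itself is done by Hypermut and is not reproduced here. The function names are hypothetical, and this is a simplified stand-in rather than the Hypermut implementation.

```python
from math import comb


def fisher_one_sided(apobec_mut, apobec_unmut, control_mut, control_unmut):
    """One-sided Fisher exact p-value for enrichment of mutations in APOBEC contexts."""
    row1 = apobec_mut + apobec_unmut       # all APOBEC-context sites
    row2 = control_mut + control_unmut     # all control-context sites
    col1 = apobec_mut + control_mut        # all mutated sites
    total = row1 + row2
    denom = comb(total, col1)
    # Sum the hypergeometric probabilities of tables at least as extreme.
    upper = min(row1, col1)
    return sum(comb(row1, x) * comb(row2, col1 - x)
               for x in range(apobec_mut, upper + 1)) / denom


def is_hypermutated(counts, alpha=0.05):
    """Apply the stated rule: p-value of 0.05 or lower means hypermutated."""
    return fisher_one_sided(*counts) <= alpha


print(is_hypermutated((3, 0, 0, 3)), is_hypermutated((1, 2, 1, 2)))
```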

Estimation of the Date of Infection of the Donors
The date of the donors' HIV-1 infection was unknown. Two methods were used to estimate the date. First, a serum sample was sent to the French National Reference Center for HIV and tested for recent infection by EIA-RI [29,30]; this single indirect enzyme-linked immunosorbent assay quantifies antibodies for TM (gp41) and V3 peptides and has been validated as being able to discriminate recently infected individuals from those with long-lasting infection.
Second, we used a probabilistic modeling approach. For each data set, the env sequences were aligned using MAFFT (L-INS-i option) [31]. Sites with more than 75% gaps were removed, and we estimated the date of infection of each chronic patient as the date of the most recent common ancestor (MRCA) of that patient's sequences, calculated using the Bayesian, MCMC-based program BEAST v1.7 [32]. We assumed a GTR+I+Γ4 substitution model and a strict molecular clock with a fixed substitution rate of 6 × 10⁻³ substitutions per site and per year (a standard value for env, see, for example, [33]). We used a Bayesian skyride tree prior as a coalescent demographic model with time-aware smoothing [34]. MCMC simulations were run for 2 × 10⁸ chain steps with subsampling every 2 × 10⁵ steps. Convergence of the chains and results were inspected using Tracer v1.5. ESS values were larger than 200 for all parameters and all data sets, except for two parameters (prior and posterior) with the MRT data set. We also tested a lognormal relaxed molecular clock, but obtained poor results as nearly identical sequences were separated by large divergence times, most notably among recipient sequences. As a consequence, the infection date of the recipient was close to the infection date of the donor, thus contradicting clinical evidence.
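The gap-filtering step can be illustrated with a short Python sketch. It assumes the alignment is available as a list of equal-length strings with `-` as the gap character; the function name and the input representation are hypothetical and are not taken from the study's pipeline.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.75):
    """Remove alignment columns in which more than max_gap_fraction of rows are gaps."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n_rows = len(alignment)
    # Keep a column when its gap fraction does not exceed the threshold.
    keep = [i for i in range(length)
            if sum(seq[i] == "-" for seq in alignment) / n_rows <= max_gap_fraction]
    return ["".join(seq[i] for i in keep) for seq in alignment]


print(drop_gappy_columns(["AC-T", "A--T", "A--G", "AG-T"]))
```

In this toy alignment only the third column (100% gaps) is dropped; the second column, with 50% gaps, is retained.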

Study Subjects
SGA-direct sequencing was used to identify and enumerate transmitted/founder env sequences in eight patients with PHI, who reported sexual exposure as their primary HIV-1 risk behavior and who denied injection drug use (Table 1). These eight patients included three women and one man infected through heterosexual exposure and four MSM. All were infected with subtype B strains without resistance to the three main classes of antiretrovirals. One patient (recipient#2) was coinfected with HBV and HCV and three others (recipients#1, 3 and 7) reported histories of STI in the 6 months preceding HIV diagnosis. At the time of the study, four of the subjects were ELISA+/WB indeterminate, one was ELISA+/WB+/p31− and three were ELISA+/WB+/p31+. Blood samples from the donors were obtained concomitantly with (donor#4), or 14 days (donors#1-3 and 6-7) or 30 days (donors#5 and 8) after, the enrolment of their respective recently infected partner (Table 2). Envelope sequences in these samples were analyzed: the EIA-RI test suggested that the donors had been infected less than 6 months previously in two cases (donors#3 and 7), and that the other six patients had long-lasting infections. These findings were confirmed by our modeling approach: donors#3 and 7 were estimated to have been infected 0.46 and 0.20 years, respectively, prior to collection of the blood sample and the remaining donors were estimated to have been infected between 2.66 and 11.17 years previously (Table 2). Histories of STI were not available for donors. However, the microbiological screening for coinfections was positive in five patients; the pathogens identified were syphilis (n = 2, donors #4 and 6), HBV (n = 4, #2, 3, 4 and 5) and HCV (n = 1, #5). There was no evidence of transmission of these infections to their partners.

HIV-1 env Diversity Analysis
Totals of 316 and 376 C2V5 env sequences were obtained from plasma vRNA and PBMC-associated DNA, respectively (median of 21 RNA sequences per subject, range 12-33; and 22.5 DNA sequences per subject). DNA env sequences could not be obtained from recipient#8 because appropriate samples were not available. A composite Neighbor-Joining phylogenetic tree was generated (Figure 1): the viral sequences formed eight distinct donor/recipient-pair-specific monophyletic lineages, each with strong statistical support (bootstrap values >98%), indicating that neither cross-contamination from other samples nor related transmission networks had occurred. Sequences from PBMC DNA and plasma RNA were distributed throughout the branch patterns of each donor and recipient, suggesting that these two sources were not compartmentalized. In all transmission pairs, recipient env sequences were highly homogeneous, forming a distinct monophyletic subcluster within the tree of donor sequences. An example of the trees of donor/recipient pairs is given in Figure 2 (pair#1).
Within-patient env diversities for the eight donors ranged from 0.16% to 7.55% (mean 3.92%) for DNA sequences, and from 0.11% to 5.32% (mean 2.44%) for RNA sequences. Again, viral diversity in both DNA (0.38%, 0.16%) and RNA sequences (0.63%, 0.11%) was low in donors#3 and 7, consistent with recent infection. The viral diversity in the six other donors was significantly higher in DNA than in RNA sequences.
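One simple way to express within-patient diversity is the mean pairwise distance between aligned sequences. The sketch below uses the uncorrected p-distance (proportion of differing sites, ignoring gapped positions), which is an assumption made for illustration; the distance model actually used to obtain the percentages above is not restated here, and the function names are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences, skipping gaps."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # pairwise deletion of gapped positions
        compared += 1
        differences += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def mean_pairwise_diversity(sequences):
    """Average p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


print(f"{100 * mean_pairwise_diversity(['ACGT', 'ACGT', 'ACGA']):.2f}%")
```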

Figure 1. Evolutionary relationships between the HIV-1 env genes in the eight donor/recipient pairs. The evolutionary history was inferred using the Neighbor-Joining method [38]. The optimal tree with the sum of branch length = 2.01912678 is shown. The tree is drawn to scale, with branch length in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method [39] and the unit is the number of base substitutions per site. Codon positions included were 1st+2nd+3rd+noncoding. All positions containing gaps and missing data were eliminated from the dataset. There were a total of 230 positions in the final dataset. Phylogenetic analyses were conducted in MEGA4 [40]. For each recipient, viruses isolated from PBMC-derived DNA ( N ) and plasma RNA (#) are represented, with a different color for each donor/recipient pair. Asterisks indicate branches with bootstrap values greater than 98%. doi:10.1371/journal.pone.0069144.g001

To evaluate the extent of sequence diversity in recipients, env sequences from the recipients were examined using the Highlighter tool, which allows comparison of each recipient env sequence to a reference recipient sequence and graphically depicts all nucleotide differences between the two. An example of the output of this tool for one recipient (recipient#1) is given in Figure 3A; it shows a remarkable degree of homogeneity of the env sequences in this recipient, whose infection was estimated to have occurred 33 days prior to sample collection. Approximately 80% (47/59) of the sequences are identical despite each being amplified from a unique viral genome. Similar proportions of identical sequences were observed in sequences obtained from both plasma (32/39) and PBMC (15/20).
Compared to the reference amplicon, approximately 17% of the sequences exhibited a single nucleotide change, and these changes were randomly dispersed over the C2V5 region; 2% of the sequences carried two nucleotide changes. Similar results were obtained by Highlighter analyses of the seven other recipients, consistent with single-variant transmission in all cases.
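The kind of tally reported above (how many sequences are identical to the reference, how many differ at one site, at two sites, and so on) can be sketched as follows. This is not the Highlighter tool itself; it is a minimal Python illustration on toy sequences, assuming pre-aligned sequences of equal length, with hypothetical function names.

```python
from collections import Counter


def mismatch_positions(reference, sequence):
    """Return the 1-based alignment positions where sequence differs from reference."""
    if len(reference) != len(sequence):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (r, s) in enumerate(zip(reference, sequence), start=1) if r != s]


def mismatch_histogram(reference, sequences):
    """Count sequences by their number of differences from the reference."""
    return Counter(len(mismatch_positions(reference, s)) for s in sequences)


reference = "ACGTAC"
sequences = ["ACGTAC", "ACGTAC", "ACGAAC", "TCGTAA"]
histogram = mismatch_histogram(reference, sequences)
for n_changes in sorted(histogram):
    share = 100 * histogram[n_changes] / len(sequences)
    print(f"{n_changes} change(s): {histogram[n_changes]} sequence(s), {share:.0f}%")
```

Listing the positions as well as the counts makes it possible to check whether changes are scattered along the region or clustered at particular sites.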

Relationships between Recipient and Donor Quasispecies
The infection in each of the eight recipients appeared to have been established from a single donor variant. We sought to determine the frequency of this variant or closely related variants within the donor quasispecies and to examine whether such variants predominated in plasma- or PBMC-derived viruses in the donor. The number of nucleotide differences between each donor virus sequence and the consensus recipient virus sequence was calculated for each of the eight transmission pairs. In all cases, at least one donor variant was identified that differed by fewer than seven nucleotides from the consensus recipient variant (Table 4). The donor variants most closely related to the recipient sequences were found in both plasma (3/8) and PBMC (6/8) samples, such that it was not possible to establish whether the infecting virus was specifically derived from a single compartment. In the donor/recipient pairs #3 and 7, the viral quasispecies were largely homogeneous in both donors and recipients. In the other pairs, fewer than 20% of the donor sequences contained fewer than 10 nucleotide differences from the recipient consensus sequence. This is illustrated in Figure 3B, which depicts the extent of sequence diversity in donor#1 using the Highlighter tool, with the same reference amplicon as in Figure 3A (an env RNA sequence isolated from recipient plasma). These findings suggest that the virus establishing the new infection is derived from an infrequent circulating variant population in the donor.
Again, the viral quasispecies identified in the eight recipients were highly homogeneous and displayed exclusive CCR5-tropism in 6/8 cases and exclusive X4/DM-tropism in 2 cases (recipients#5 and 8). The two patients harboring X4/DM viruses

Figure 2. Evolutionary relationships between the HIV-1 env genes in donor/recipient pair #1. The evolutionary history was inferred using the Neighbor-Joining method [38]. The optimal tree with the sum of branch length = 0.28572580 is shown. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method [39] and the unit is the number of base substitutions per site. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated from the dataset (Complete deletion option). There were a total of 500 positions in the final dataset. Phylogenetic analyses were conducted in MEGA4 [40]. For the recipient, viruses from PBMC-derived DNA (
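The comparison between donor sequences and the recipient consensus can be illustrated with a small Python sketch: build a majority-rule consensus of the recipient sequences, count nucleotide differences for each donor sequence, and report the closest donor variants. The sequence labels and function names are hypothetical, the sequences are toy data, and ties in the consensus are broken by first occurrence, which is an arbitrary choice made for this example.

```python
from collections import Counter


def consensus(sequences):
    """Majority-rule consensus of aligned sequences (ties go to the first seen base)."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*sequences))


def hamming(a, b):
    """Number of differing positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_donor_variants(donor, recipient_sequences):
    """Return the smallest distance to the recipient consensus and the donor labels reaching it."""
    target = consensus(recipient_sequences)
    distances = {label: hamming(seq, target) for label, seq in donor.items()}
    best = min(distances.values())
    return best, sorted(label for label, d in distances.items() if d == best)


recipient = ["ACGTACGT", "ACGTACGT", "ACGTACGA"]
donor = {"plasma_1": "ACGTTCGT", "pbmc_1": "ACGTACGA", "pbmc_2": "TTTTACGT"}
print(closest_donor_variants(donor, recipient))
```

Keeping the sample source in the label (plasma or PBMC) lets the same output show in which compartment the closest variants were found.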

Discussion
We report an analysis of eight transmission pairs, each including a sexually infected patient enrolled into the French ANRS PRIMO cohort at the time of primary HIV-1 infection and his/her donor. This provided a unique opportunity to compare epidemiologically linked virus populations in donors and recipients very close to the time of virus transmission, in contrast to many previous studies, which characterized the transmitted viral populations in recipients only, without virological data from donors.
Our study was based on the novel SGA-based approach, which is more precise than the methods used in many previous studies for estimating the multiplicity of HIV-1 infection in recently infected individuals [3,5,7,8,10,11,12,36,37]. We found that the viral populations, as assessed from both DNA and RNA sequences, were highly homogeneous in each of the eight sexually infected PHI patients. Our findings are consistent with recent studies, based on similar methods, which suggest that there is a severe genetic bottleneck associated with sexual transmission of HIV-1 [13,14,16,17]. The observed highly homogeneous viral population in PBMC is consistent with a single variant being transmitted, which then massively fuels the cellular reservoir, rather than multiple variants being transmitted and rapidly followed by virus population homogenization in the recipient [37]. We herein show that the donor variant most closely related to the strain establishing infection was infrequently found within the quasispecies present in donor blood. This result suggests that the recipient's infection is not due to the preferential transmission of a strain overrepresented among circulating quasispecies in the donor. There are several possible explanations for this finding. First, viral transmission could be a stochastic process in which any donor variant may be transmitted, whatever its frequency among blood quasispecies. Second, the severe genetic bottleneck that occurs during sexual transmission may involve properties of the “mucosal barrier” and/or selection of blood variants with properties favoring transmission, as recently suggested [16]. Under both hypotheses, we cannot exclude that compartmentalization between blood and genital viral subpopulations may contribute to the selection of the transmitted/founder strain: at the time of transmission, the predominant strains may be different in the blood and genital compartments.
Boeras et al. recently analyzed viral envelope sequences in the blood and genital fluids of eight transmission pairs and found that the viruses establishing infection were in most cases more closely related to blood-derived variants than to the variants prevalent in the genital compartment [38]. However, a recent study in the SIV macaque model suggested that SIV sequences are intermixed between the blood and the semen at the time of peak virus replication, but that SIV replication evolves to compartmentalization in the male genital tract after peak viremia resolves [39]. Similarly, Redd et al., working with Ugandan HIV-1-discordant couples, showed preferential transmission of ancestral as opposed to contemporary strains circulating in the donor [40]. The authors suggested that transmitted strains may be sequestered during the early stage of infection in a long-lived reservoir, such as latently infected cells of the genital tract, persist at a low level in blood, and are potentially preferentially selected for subsequent transmission.
In 2010, a retrospective comparison of multivariant HIV-1 transmission among patient cohorts using new SGA-based determinations concluded that the multiplicity of infection is higher in MSM than in heterosexuals (38% versus 19%) [17], consistent with previous studies using less precise methods [36]. We report a homogeneous viral population in each of eight sexually infected patients enrolled near the time of PHI, including four male-to-male, three male-to-female and one female-to-male transmissions. The small number of MSM donors included in our study and the fact that one of them (donor#7) harbored a homogeneous viral population may explain the absence of differences between MSM and heterosexuals. However, our findings are in line with a much larger study of clonal env sequences obtained from 145 patients at the time of PHI, which suggested that homosexual (versus heterosexual) transmission mode did not predict transmission of more heterogeneous founder virus populations [41].
In addition, we found no case of multiple-variant infection in patients with STI, although 3/8 recipients reported STI in the 6 months prior to PHI diagnosis and 2/8 donors tested positive for syphilis at the time of blood collection. These findings are in line with the study by Rieder et al., which did not find elevated complexity of transmitted viruses in patients infected through sexual intercourse and presenting with a concomitant STI during PHI [41]. In contrast to these results, Haaland et al. reported that multiple variant transmission was associated with the presence of genital inflammation or ulceration or with self-reported lower abdominal pain in 42 recently infected heterosexuals [16]; however, the authors did not find any association between multiple variant transmission and vaginal/urethral discharge, cystitis or the presence of genital inflammation or genital ulceration when analyzed as independent risk factors. Sagar et al. previously suggested that the presence of genital tract infections was associated with the acquisition of multiple variants in Kenyan female sex workers [42]. However, samples from the sexual partners of these patients were not available in that study, so it is not clear whether some of these multiple infections were due to successive infections rather than to concomitant transmission of multiple variants. Further studies including samples from both donors and recipients are needed to characterize the types of genital disease likely to increase the risk of multiple variant transmission.
The eight donors in our study were predominantly infected by R5-tropic strains, but two recipients were infected with a homogeneous X4/DM viral population, isolated in both DNA and RNA samples. Interestingly, the proportions of X4/DM viruses in the viral quasispecies in their respective donors were significantly higher than those in the other donors. There are at least two possible explanations for these findings. First, our cases of X4-tropic viral transmission may have been driven by a disproportionately higher proportion of X4 strains in the donor's genital fluids than in the blood compartment. However, this is not consistent with previous reports, which indicate that the frequency of X4/DM quasispecies in both male [43] and female [44] genital tracts is lower than in blood plasma. Second, the transmission of X4/DM variants could result from a stochastic process. This would be inconsistent with the conclusions of numerous studies which have attempted to correlate the predominance of CCR5 strains during the acute phase of infection with a biological bottleneck inherent to the genital mucosa [45,46]. However, no conclusive evidence has been provided to indicate that X4 viruses are less transmissible [47] and a recent study concluded that R5 and X4 infections may result from a stochastic process [48]. This conclusion is consistent with our results, which suggest that the transmission of X4 strains is associated with a threshold population of X4 quasispecies in donor plasma and PBMC samples.
That a single virus, derived from an infrequent variant of the donor quasispecies, establishes infection in patients confirms that a severe genetic bottleneck occurs during subtype B HIV-1 heterosexual and homosexual transmission. Additional studies are required to fully understand the traits that confer the capacity to transmit and establish infection, and determine the role of concomitant STIs in mitigating the genetic bottleneck in mucosal transmission. Such studies will be critical for guiding interventions aimed at preventing HIV-1 sexual transmission.

Valérie GARRAIT, Service de Médecine interne, Centre hospitalier intercommunal de Créteil, Créteil, France.