Increased predominance of HIV-1 CRF01_AE and its recombinants in the Philippines

The growth rate of new HIV infections in the Philippines was the fastest of any countries in the Asia-Pacific region between 2010 and 2016. To date, HIV-1 subtyping results in the Philippines have been determined by characterizing only partial viral genome sequences. It is not known whether recombination occurs in the majority of unsequenced genome regions. Near-full-length genome (NFLG) sequences were obtained by amplifying two overlapping half genomes from plasma samples collected between 2015 and 2017 from 23 newly diagnosed infected individuals in the Philippines. Phylogenetic analysis showed that the newly characterized sequences were CRF01_AE (14), subtype B (3), CRF01/B recombinants (5) and a CRF01/CRF07/B recombinant (1). All 14 CRF01_AE formed a tight cluster, suggesting that they were derived from a single introduction. The time to the most recent common ancestor (tMRCA) for CRF01_AE in the Philippines was 1995 (1992–1998), about 10–15 years later than that of CRF01_AE in China and Thailand. All five CRF01/B recombinants showed distinct recombination patterns, suggesting ongoing recombination between the two predominant circulating viruses. The identification of partial CRF07_BC sequences in one CRF01/CRF07/B recombinant, not reported previously in the Philippines, indicated that CRF07_BC may have been recently introduced into that country from China, where CRF07_BC is prevalent. Our results show that the major epidemic strains may have shifted to an increased predominance of CRF01_AE and its recombinants, and that other genotypes such as CRF07_BC may have been introduced into the Philippines.


INTRODUCTION
Global efforts to strengthen HIV prevention and treatment programmes have reduced the transmission of HIV. However, whereas the growth in the number of HIV infections is decreased in many countries, Philippines has the fastest growth rate in the Asia-Pacific region, up by 200 % from 2010 to 2016 [1]. One of the main reasons for the sharp increase in the number of HIV infections in the Philippines is most likely the inadequate education and health promotion policy provided to the population, especially to the key at-risk populations: men who have sex with men (MSM), transgender women who have sex with men (TGW) and injection drug use (IDU) [2]. In 2016, 83 % of new HIV-1infection cases were among MSM and TGW, most of whom were aged between 15 and 24 years [2]. The Philippines is facing a huge challenge to the fight against HIV [2,3].
Since the first patient with AIDS in the Philippines was reported in 1984 [4], several HVI-1 subtypes (B, C, D and G), circulating recombinant forms (CRFs: CRF01_AE and CRF02_AG) and unique recombinants (01B and others) have been reported in that country [4][5][6][7][8]. These early studies showed that subtype B was the most prevalent HIV-1 strain (70 %) followed by CRF01_AE (20 %), while others accounted for smaller percentages. However, one recent study of pol gene sequences showed that CRF01_AE has become predominant (77 %) while the proportion of subtype B has decreased (22 %) [9]. All previous molecular epidemic surveys were carried out based on analysis of partial gag, pol or env sequences. Thus, the distribution of subtypes or CRFs in the Philippines may not be accurately accounted for, since the larger portion of the viral genome was not analysed. Thus, it is important to characterize HIV-1 whole-genome sequences to better understand whether, in the Philippines, new recombinants have been generated and become prevalent strains.
To better understand what viruses are circulating in the Philippines, we analysed near-full-length genome (NFLG) sequences from 23 HIV-1-infected individuals. Genetic analyses showed that CRF01_AE was predominant (61 %) and unique recombinants accounted for 26 %, while subtype B comprised only 13 % of the virus population involved. Our results indicate that CRF01_AE has become predominant, and its recombination with other circulating strains are increasing in frequency in the Philippines.

METHODS Participants
Patients newly diagnosed with HIV-1 infection in Medical City, which is an 800-bed hospital with an established Department of Health-accredited HIV treatment clinic, located in the National Capital Region of the Philippines, were invited to participate in this study during their first clinic visit in the period 2015-2017. All study participants, except one, were single Filipino males ages 22-42 years (mean age 29.21 years ±SD 5.33), from the following provinces: Bulacan (2), Capiz (1), Cavite (1), Cebu (1), Laguna (2) and Rizal (1); and from the following cities: Makati (2), Malabon (1), Mandaluyong (2), Manila (2), Pasig (3) and Quezon City (5). All patients were treatment-naive at the time of recruitment. Eighteen (78 %) reported homosexual transmission. All patients denied use of intravenous drugs. The mean CD4 count was 294.17±180.37 ml -1 , with seven (30 %) having a CD4 count <200 ml -1 . Plasma samples were collected from 23 subjects. Written informed consent was obtained from all participants. The study was approved by The Medical City Institutional Review Board and by the Duke University Institutional Review Board.
Sequence analysis PCR amplicons were quantified using qPCR with the KAPA Library Quantification Kit Illumina platform (Kapa Biosystems, Wilmington, MA). The PCR amplicon from each sample was barcoded and then sequenced on MiSeq (Illumina, San Diego, CA) using the MiSeq Reagent Nano kit v2 (300 bp). The average coverage per base was 500-8000. The final consensus sequence from each library was obtained by assembling raw sequence reads using either Geneious software (Biomatters, Auckland, New Zealand) or High-performance Integrated Virtual Environment (HIVE) [11].
The final sequences were aligned together with subtype reference sequences from the Los Alamos HIV Sequence Database (www.hiv.lanl.gov) using CLUSTAL W [12], and manual adjustment for optimal alignment was done using SEA-VIEW. Subtypes of newly characterized HIV-1 genomes were determined by phylogenetic tree analysis using the neighbour-joining (NJ) method with the Kimura twoparameter model [13,14], and the reliability of topologies was estimated by bootstrap analysis with 1000 replicates. Recombination patterns in newly characterized HIV-1 genomes were initially analysed by the jumping profile Hidden Markov Model (jpHMM; http://jphmm.gobics.de/ submission_hiv.html) [15]. The recombination breakpoints were confirmed by BootScan implemented in Simplot version 3.5.1 [16]. The recombination pattern of each virus was illustrated using RecDraw [17].

Molecular evolution clock analysis
The divergence times for CRF01_AE were estimated using the Bayesian Markov chain Monte Carlo (MCMC) approach available in the package BEAST v1.8.2. The relaxed (uncorrelated log-normal) molecular clocks were enforced under the HKY nucleotide substitution models [18], with a gamma-distribution model of among-site rate heterogeneity (with four rate categories) [19]. Each MCMC analysis was run for 50 million steps and sampled every 10 000 states. Posterior probabilities were calculated with a 10 % burn-in and checked for convergence using Tracer v1.6. The maximum clade credibility tree was generated using Tree Annotator v1. 8

Genotypic analysis of drug resistance mutations
The raw sequence reads generated from MiSeq (Illumina, San Diego, CA) were uploaded to the HyDRA website [21]. All HIV drug resistance (HIVDR) mutations found in the pol genesprotease (PR), reverse transcriptase (RT) and integrase (IN) -are reported according to classifications outlined in the Stanford HIV Drug Resistance Database (https://hivdb.stanford.edu/) [22].

Nucleotide sequence accession numbers
The GenBank accession numbers for the newly characterized sequences are MH327744-MH327766.

Determination of infection stages
Fiebig stages of HIV-1 infection were determined based on the detection of viral genomes and HIV-1-specific antibodies in plasma as previously described [23]. Three samples were collected at Fiebig stage IV, two at Fiebig stage V, and 18 at Fiebig VI (Table 1). Recent ( 130 days) and long-term (>130 days) infection stages of these samples were also determined by limiting-antigen avidity (LAg) assay [24]. Seventeen were long-term infections (LT) while five were recent infections (

Predominant CRF01_AE sequences are monophyletic in the Philippines
The NFLG sequences were obtained from 22 plasma samples by amplifying two overlapping half genomes. For the remaining sample (1008), which was negative for PCR amplification, the virus was isolated by PBMC co-culture from plasma. The NFLG sequence was obtained from viruses in cell culture supernatants by PCR amplification of two overlapping half genomes. The initial phylogenetic analysis of 23 near-full-length genome sequences showed that 20 newly characterized sequences clustered to the CRF01_AE references sequences, while three other sequences clustered closely to subtype B references sequences (Fig. 1).
Some of the sequences either clustered far outside of the CRF01_AE clade or had longer branches than others (Fig. 1). To investigate whether such sequences were the result of recombination among different subtypes, we performed recombination analysis of these sequences using the tools jpHMM and BootScan. This further analysis showed that these sequences were indeed recombinants: five (22 %) CRF01/B recombinants and one (4 %) CRF01/CRF07/B recombinant, while 14 (61 %) were CRF01_AE and three (13 %) were subtype B (Fig. 2).
Interestingly, all 14 CRF01_AE sequences formed a tight cluster, indicating that these might have been derived from the same common CRF01_AE ancestor (Fig. 1). Compared to new CRF01_AE sequences, new subtype B sequences were more divergent and intermingled with subtype B reference sequences, suggesting that subtype B viruses in the Philippines were probably derived from multiple ancestors (Fig. 1). To further confirm our observation, we constructed a phylogenetic tree with our newly characterized sequences and hundreds of available NFLG CRF01_AE, subtype B, CRF01/B and CRF01/B/C sequences from the HIV sequences database. All new CRF01_AE sequences, together with four (1005, 1009, 1011 and 1030) CRF01/B recombinants, which contained only small subtype B portions, formed a tight cluster (Fig. 3). One CRF01_AE sequence from Japan clustered with all the newly characterized sequences. In contrast, all three new subtype B sequences remained intermingled with subtype B reference sequences.
An additional 691 partial sequences from the Philippines were available from the HIV sequence database. We next sought to investigate how the newly characterized sequences related to these partial sequences. Two phylogenetic trees were constructed for the partial pol and env sequences (Fig. 4). Similar to the previous analysis, all partial pol and env sequences of the new CRF01_AE viruses still formed a tight cluster together with sequences previously reported from the Philippines (Fig. 4a), or per se (Fig. 4b). Three new subtype B sequences were intermingled with subtype B reference sequences and were as divergent as previously reported partial pol and env sequences (Fig. 4). Taken together, new CRF01_AE sequences from this study appear to form a closely related cluster among CRF01_AE sequences, together with, or without, previously reported CRF01_AE sequences, suggesting that they share the same most recent common ancestor (MRCA), while other previously reported CRF01_AE sequences formed a distinct cluster, suggesting that they were decendents from another MRCA. New subtype B sequences were intermingled with other subtype B sequences, indicating that they were the results of multiple introductions into the Philippines.
Ongoing extensive recombinants between CRF01_AE and subtype B Detailed recombination analysis of six recombinant NFLG sequences showed that five CRF01/B recombinants had distinct recombination patterns between CRF01_AE and subtype B (Fig. 2). Among 18 recombination breakpoints, only two sites were shared among three viruses (1001, 1005 and 1011), suggesting that these recombinants were newly generated and had not yet spread as widely as circulating strains in the population.
The NFLG sequence from 1023 was a complicated recombinant among CRF01_AE, C and subtype B in the initial analysis, with six fragments from CRF01_AE, three from subtype C and two from subtype B (Fig. 2). While including the CRF07_BC and CRF08_BC reference sequences, which are exclusively predominant in China [25], all three subtype C region sequences clustered tightly only with CRF07_BC sequences and not with CRF08_BC sequences or pure subtype C sequences (Fig. 5). This indicates that all three region sequences were specifically derived from CRF07_BC viruses rather than from other subtype C viruses. Taken together, these results show that a high percentage (26 %) of NFLG sequences were recombinants, and extensive recombination had been ongoing between CRF01_AE and other genotype viruses.

Timing of the introduction of CRF01_AE in the Philippines
To estimate the timing of the introduction of CRF01_AE viruses in the Philippines, we generated a maximum clade credibility (MCC) tree with NFLG sequences of 14 CRF01_AE sequences from this study, 16 CRF01_AE Fig. 4. Phylogenetic analysis of partial pol and env gene sequences. Partial pol (left) and env (right) sequences available from the Philippines in the HIV sequence database were analysed, together with the newly characterized NFLG sequences and reference sequences. Since the sequences in each region were from different studies and did not fully overlap, neighbour-joining trees were constructed to include as many sequences as possible and to maximally utilize the sequence length for all available sequences. The CRF01_AE sequences from this study and from others are indicated in red and blue, respectively. Reference sequences are shown in black.

Prevalence of drug-resistant viruses in the Philippines
Analysis of NFLG sequences also allowed us to estimate the frequency of drug-resistant viruses in the Philippines. After the raw reads were analysed using HyDRA [21], we detected four major drug-resistant mutations (DRMs) in four patients (1005, 1008, 1022 and 1023) ( Table 2). The prevalence of drug-resistant viruses (17 %, 4 of 23 samples) in these samples from the Philippines is relatively high compared to that in other countries [27,28]. The highest percentage of the DRM in the viral population was 12.7 % for M184I in 1005. The percentages of the three other DRMs were very lowfrom 2.2 to 2.6 %. All those DRMs were present in <20 % of the viral population and were probably undetectable by the conventional Sanger population sequencing method [29,30]. Multiple DRMs were not detected in any participants, and no DRMs to integrase inhibitors were detected in any participant.

DISCUSSION
Analysis of NFLG sequences from 23 HIV-1-infected individuals in the Philippines showed that CRF01_AE was the most predominant (61 %), while subtype B accounted for only 13 % of the virus population. We also found a high percentage (26 %) of recombinants. This is significantly different from what was previously reported [5][6][7][8], which showed 70 % subtype B, 20 % CRF01_AE and only small percentages of other recombinants. Although our study sample size was relatively small, the high percentage of CRF01_AE and low percentage of subtype B indicate a sharp change in the genotype distribution in the Philippines. Our results are in agreement with a more recent study which showed CRF01_AE at 77 % and subtype B at 22 % by analysing only a partial pol gene sequence [9], suggesting a dramatic shift is  CRF01_AE has spread widely through sexual transmission among MSMs. The significantly higher percentage of recombinants in this study than those previously reported [4,8,9] demonstrates that the proportion of recombinants in a region can be significantly underestimated using only partial genome sequences [31][32][33][34]. An accurate distribution of HIV-1 subtypes, CRFs and unique recombinants can only be reliably estimated by NFLG sequences.
All new CRF01/B recombinants had unique recombination patterns, and only a few recombinant breakpoints were shared among the six recombinants. This indicates that a high level of recombination is ongoing between these two predominant co-circulating genotypes (CRF01_AE and subtype B) in the Philippines. Interestingly, one virus (1023) from an MSM participant was a recombinant among CRF01_AE, CRF07_BC and subtype B. CRF07_BC originated from southwestern China [35] and has quickly become one of the main circulating CRFs and subtypes in that country [36]. CRF07_BC had not previously been identified in the Philippines. The identification of CRF07_BC-like sequences in three regions in the viral genome of 1023 strongly suggests that CRF07_BC was introduced into the country but is present at a level too low to be detected.
The tight cluster of CRF01_AE NFLG sequences suggested that these were the result of a single introduction and evolved into a unique viral population in the Philippines after its introduction. Molecular evolution clock analysis of NFLG sequences showed that these CRF01_AE viruses were probably introduced into the Philippines around 1995. CRF01_AE originated in Central Africa [37], but was found to be most prevalent in Asian countries [38][39][40]. CRF01_AE was introduced into the Philippines about 10-15 years later than it was into other Asian countries including Thailand (late 1970s) [41], China (mid-1980s) [26] and Vietnam (early 1980s) [40]. One of the reasons may be because that the Philippines is geographically isolated from other neighbouring countries, leading to a relatively later introduction. Interestingly, one CRF01_AE virus from Japan falls into the same tight cluster as those from the Philippines (Fig. 3). However, it is unclear by which routes that these viruses were transmitted from between countries due to the lack of epidemiological information.
Four major DRMs were found in four individuals, accounting for 17 % (4/23) of 23 treatment-naïve HIV-1infected individuals in this study. This prevalence is much higher than those in other Asian countries such as China (3.8 %) [27] and Thailand (2.0 %) [28], but is at the high end of the prevalence scale (13.5 -20 %) in Western Europe and North America [42,43]. One of the probable reasons for such a high prevalence in the treatment-naïve population of the Philippines is that the lower-frequency DRMs in the samples were easily detected by NGS. Three of four DRMs were present at~2 %, while the fourth was present only at 12.7 %. Compared to conventional Sanger sequencing, which generally detects only those DRMs present at >20 % in the viral population [44][45][46], NGS can detect as low a level as~1 % of DRMs [47]. This further confirms the importance of detection or monitoring DRMs using more sensitive methods.
The results from this study underscore the importance of NFLG sequence analysis in determination of the distribution of HIV-1 genotypes across diverse geographic regions for accurate detection of recombination patterns in recombinant HIV-1 genomes. The extensive recombination and marked increase of recombinants in a population will significantly increase the complexity of genetic variation [48], which may have important implications in vaccine development and patient treatment. Understanding dramatic shifts among HIV-1 subtypes, CRFs and unique recombinants, as well as the prevalence of drug-resistant viruses in a population, will be important for better epidemic control, development of effective vaccines and better treatment of HIV-1infected individuals