Transmission network and phylogenetic analysis reveal older male-centered transmission of CRF01_AE and CRF07_BC in Guangxi, China

ABSTRACT In China, the number of newly reported HIV infections in older people is increasing rapidly. However, clear information on the impact of older people on HIV transmission is limited. This study aims to reveal the local HIV transmission patterns, especially how older people affect virus transmission. Subtype analysis based on available pol sequences obtained from HIV patients revealed that CRF01_AE and CRF08_BC were predominant in patients aged <50 years, whereas CRF01_AE was predominant in older people aged ≥50 years (χ2 = 29.299, P < 0.001). A total of 25 patients (5.2%, 25/484) were identified with recent HIV infection (RHI). Transmission network analysis found 267 genetically linked individuals forming 55 clusters (2–63 individuals), including 5 large transmission clusters and 12 transmission clusters containing RHI. Bayesian phylogenetic analysis suggested that transmission events in CRF01_AE and CRF07_BC were centred on older males, while transmission events in CRF08_BC were centred on younger males. Multivariable logistic regression analysis showed that older people were more likely to cluster within networks (AOR = 2.303, 95% CI: 1.012–5.241) and that RHI was a significant factor associated with high linkage (AOR = 3.468, 95% CI: 1.315–9.146). This study provides molecular evidence that older males play a central role in the local transmission of CRF01_AE and CRF07_BC in Guangxi. Given the wide circulation of CRF01_AE and CRF07_BC in Guangxi, HIV screening should be recommended as part of free national medical examinations for older people to improve early detection and timely treatment and to further reduce second-generation transmission.


Introduction
The Guangxi Zhuang Autonomous Region (Guangxi) is located in southwestern China, bordering Vietnam. Since the first indigenous case of human immunodeficiency virus (HIV) infection was detected in 1996 [1], the prevalence of acquired immune deficiency syndrome (AIDS) in Guangxi has been increasing dramatically. As of October 2020, the cumulative number of people living with HIV in Guangxi exceeded 97,000, indicating a 39.5% increase since June 2011 (69,548 cases). As a result, Guangxi ranks third with 9.3% of the total reported HIV cases in China, while accounting for less than 4.0% of the national population [2]. The HIV prevalence in Guangxi is three times higher than the national average (1.5% vs. 0.45%) [3]. There are approximately 10,000 newly reported HIV cases in Guangxi each year. Initially, the HIV epidemic in Guangxi was driven by intravenous drug users (IDUs), but the principal route of HIV spread has shifted to sexual transmission (>95%) since 2006 [4]. Due to the hidden nature of sexual transmission and the barriers to intervention, which increase the likelihood of a more diffuse and generalized epidemic [5], HIV poses an increasing threat to the heterosexual population, especially older heterosexuals in rural areas.
As a developing country with a population of more than 1.4 billion, China, like many developed countries, is facing the challenge of an aging population. By 2020, China's population aged 50 years and older reached 486.58 million, accounting for 34.52% of the total population [6]. The rural population is aging significantly faster than the urban population [6]. Guangxi is a typical province that is "aging before getting wealthy". As a remote area with relatively slow economic and social development, Guangxi is experiencing an alarming rate of population aging. The demographic shift from middle adults to the elderly has taken only 10 years. According to the results of the 7th national census in 2020, the permanent resident population ≥60 years old in Guangxi accounted for 16.69% of the total population [7]. Lowering the age boundary to 50 years increases this proportion significantly. Older people are at higher risk of various chronic diseases, including AIDS, which places a heavy burden on society and families. This is a severe issue that deserves further attention.
Older HIV patients aged ≥50 years represent a unique group, distinct from the usual sexually active population (15-49 years old) [8]. Recent evidence demonstrated that older people remain sexually active and that an increasing number of older Chinese are living with HIV [9]. A previous meta-analysis noted an HIV prevalence of 1.68% in older people from 2010 to 2018 [10], considerably higher than the 0.058% prevalence found in the general population [11], further emphasizing that older people are of concern. High-risk behaviours, such as having multiple sexual partners and unprotected sex with non-spousal partners, are common among older people [12,13], putting them at higher risk of HIV infection and transmission [14]. The number of newly reported HIV cases among those aged 50 years and older has been increasing year over year [15]. In Guangxi, older people accounted for more than 40% of reported HIV cases in 2013, up from less than 4% in 2000 [4]. In particular, the proportion of older males with HIV has increased to 64.8% [16]. Older people living in rural areas, especially older heterosexual male clients of female sex workers (FSWs) from low-cost commercial sex venues, were considered at higher risk of HIV infection [17-19].
Phylogenetic analysis and transmission networks are useful tools that can be used reliably to define closely related clusters that reflect actual transmission [20], and thus guide the development of targeted intervention strategies [21]. Previous studies in China have suggested that older people play a crucial role in local HIV transmission, but they were based on limited sample sizes [22] and did not demonstrate viral transmission among older people in large clusters or reveal the impact of new infections in older people on local HIV transmission [23]. Currently, newly reported HIV cases in Guangxi are mainly from the elderly population; however, clear information on the impact of older people on HIV transmission networks and phylogenetics is limited.
Qinzhou city, located in the south of Guangxi, has a serious HIV epidemic. The HIV molecular epidemiological and socio-demographic characteristics of people living with HIV (PLHIV) in Qinzhou are consistent with those of the whole province. Firstly, Qinzhou ranks third among Guangxi's 14 cities in terms of cumulative HIV/AIDS cases. Secondly, the distribution of HIV subtypes in Qinzhou is consistent with that in Guangxi, dominated by circulating recombinant form (CRF) 01_AE, CRF08_BC and CRF07_BC [24]. Thirdly, the main HIV transmission route in Qinzhou is heterosexual transmission (94.5%) [25], which is comparable with the data for Guangxi (>90%) [4]. Finally, the majority of HIV cases reported in Qinzhou were male (74.2%) and elderly (53.1%) [24], which is also consistent with the situation in Guangxi [4,16].
This study was designed to reveal the local transmission patterns of HIV in Qinzhou, Guangxi. We also focused specifically on the impact of older people on HIV transmission, aiming to inform the development of targeted intervention strategies.

Study population and data collection
From January 1, 2017 to December 31, 2018, 518 previously and newly reported HIV patients were recruited by convenience sampling in all four administrative counties in Qinzhou city. All subjects in this study were diagnosed between 1999 and 2018, and the majority (67.4%) were newly diagnosed between 2017 and 2018. The inclusion criteria were as follows: (1) at least 16 years old; (2) living with HIV; and (3) no previous history of antiretroviral therapy (ART).
Socio-demographic information including age, gender, ethnicity, education, occupation, marital status, transmission route, high-risk sexual behaviour, history of intravenous drug use, date of diagnosis and current place of residence was obtained through a face-to-face interview. Among these, high-risk sexual behaviour was defined as having a homosexual, commercial, or casual sexual partner. The time from diagnosis to sampling was calculated for each participant. Venous blood was extracted, and the plasma was separated and stored at −80°C. After sampling, all participants were included in treatment. Written informed consent was obtained from all participants. This study was reviewed and approved by the Human Research Committee of Guangxi Medical University (No. 20170228-21).

The Maxim HIV-1 LAg-Avidity EIA test
To identify individuals who may have been recently infected at the time of sample collection, an HIV-1 Limiting Antigen Avidity (LAg-Avidity) Enzyme Immunoassay (EIA) test (Cat. No. 92001, Maxim Biomedical, Rockville, MD, United States) was performed for samples with less than one year between confirmation of HIV infection and sampling (n = 321). A single initial screening test was performed on all plasma samples. Then, a confirmatory test was performed in triplicate on specimens with an initial screening normalized optical density (ODn) ≤2.0. The threshold value for ODn was set at 1.5, corresponding to a mean time since seroconversion of 130 days [26]. Recent HIV infection (RHI) was determined if the median ODn value of the triplicate confirmatory tests was ≤1.5; otherwise, chronic HIV infection (CHI) was determined [27].
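The two-step decision rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name classify_lag and the constant names are hypothetical, and specimens whose screening ODn exceeds 2.0 are assumed to be classified as CHI without confirmatory testing, which the text implies but does not state explicitly.

```python
from statistics import median

SCREEN_CUTOFF = 2.0  # screening ODn at or below this value triggers confirmatory testing
RECENT_CUTOFF = 1.5  # median confirmatory ODn at or below this value is called RHI


def classify_lag(screen_odn, confirm_odns=None):
    """Return 'RHI' or 'CHI' for one specimen (hypothetical helper).

    Assumption: a screening ODn above SCREEN_CUTOFF is treated as CHI
    without any confirmatory run.
    """
    if screen_odn > SCREEN_CUTOFF:
        return "CHI"
    if confirm_odns is None or len(confirm_odns) != 3:
        raise ValueError("confirmatory testing requires three ODn readings")
    # The median of the triplicate readings decides the final call.
    return "RHI" if median(confirm_odns) <= RECENT_CUTOFF else "CHI"
```

For instance, a specimen with a screening ODn of 1.8 and confirmatory readings of 1.4, 1.5 and 1.7 has a median of 1.5 and would be called RHI under this rule.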

HIV nucleic acid extraction, amplification and sequencing
HIV RNA was extracted using an automated nucleic acid extraction machine (NP968-S system) and the Tianlong RNA extraction Kit (Tianlong, Xi'an, China) according to the manufacturer's standard protocol. When the initial extraction was unsuccessful, RNA was extracted again using the High Pure Viral RNA Kit (Roche, Germany). The RNA was then subjected directly to nested polymerase chain reaction (PCR) with the PrimeScript One Step RT-PCR Kit (Takara, Dalian, China) to generate pol gene fragments (HXB2: 2253-3464) as previously described [28]. Amplification products from PCR-positive samples were purified and sequenced. The chromatogram data were cleaned and assembled using Sequencher v5.4.6 (Gene Codes, Ann Arbor, MI). Only sequences over 900 nucleotides were retained in our analysis, because network inference for shorter sequences is inaccurate [29,30].

HIV subtype analysis
Firstly, sequences with a proportion of mixed bases greater than 5% were removed using the online tool Quality Control in the Los Alamos National Laboratory HIV Database (hereafter referred to as the HIV Database, https://www.hiv.lanl.gov). The sequences obtained were aligned with a reference dataset downloaded from the HIV Database using the online tool HIV Align and edited manually using BioEdit v7.0. Initial subtype analysis was performed with an approximately maximum likelihood (ML) phylogenetic tree constructed using the general time-reversible substitution (GTR) model in FastTree v2.2.10 and Figtree v1.4.3. Sequences that clustered with the reference sequence and with bootstrap values ≥70% were identified as the same subtype as the reference sequence. Sequences that could not be determined by the initial subtype analysis were identified using the online tool HIV BLAST. After these two rounds of subtype analysis, sequences that still could not be determined were defined as unique recombinant forms (URFs).
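As an illustration of the sequence-level filters (more than 900 nucleotides, and no more than 5% mixed bases), a minimal Python sketch is given below. It is not the HIV Database Quality Control tool itself. The helper names are hypothetical, and "mixed bases" is assumed here to mean IUPAC ambiguity codes counted over non-gap positions.

```python
# Assumed definition of "mixed bases": IUPAC ambiguity codes.
MIXED_BASES = set("RYKMSWBDHVN")
GAP_CHARS = set("-.")


def mixed_base_fraction(seq):
    """Fraction of non-gap positions that carry an ambiguity code."""
    bases = [b for b in seq.upper() if b not in GAP_CHARS]
    if not bases:
        return 0.0
    return sum(b in MIXED_BASES for b in bases) / len(bases)


def passes_qc(seq, max_mixed=0.05, min_length=900):
    """Keep sequences longer than min_length with at most max_mixed mixed bases."""
    length = sum(1 for b in seq if b not in GAP_CHARS)
    return length > min_length and mixed_base_fraction(seq) <= max_mixed
```

Both thresholds are exposed as parameters so that the effect of a stricter or looser filter can be checked before building the alignment.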

HIV transmission network inference
HIV transmission networks were defined by the genetic distance (GD) matrix approach and constructed using the HIV-TRAnsmission Cluster Engine (HIV-TRACE) [31], which has been widely used to construct molecular networks. We performed a sensitivity analysis across GD thresholds (ranging from 0.1% to 2.5%) to determine the optimal threshold (i.e. the GD at which the maximum number of transmission clusters is detected). Putative transmission links were then inferred between sequences with a Tamura-Nei 93 GD ≤1.8%. Clusters were sorted by the number of nodes from highest to lowest and numbered in ascending order, so that cluster one is the largest. Large transmission clusters were defined as those containing 10 or more patients. We focused on transmission clusters containing RHI to characterize both the clusters in which recent transmission events had occurred and the RHI patients themselves. We also ranked the degrees of all nodes within the network and defined nodes with degrees greater than or equal to the upper quartile (i.e. degree ≥7) as having high linkage. Nodes with degrees ≥15 were defined as high-degree individuals (i.e. individuals with a high risk of transmission).
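The threshold-based network construction can be illustrated with a short Python sketch. It is not HIV-TRACE: it assumes that pairwise TN93 distances have already been computed and are supplied as a dictionary keyed by sequence-ID pairs, and all function names are hypothetical. Clusters are taken as connected components of the thresholded graph, and the sensitivity analysis simply counts clusters at each candidate threshold.

```python
from collections import defaultdict


def build_network(distances, threshold=0.018):
    """Adjacency sets linking pairs whose distance is at or below the threshold."""
    adj = defaultdict(set)
    for (a, b), d in distances.items():
        if a != b and d <= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def find_clusters(adj):
    """Connected components of the network, largest first (cluster one = largest)."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        components.append(comp)
    return sorted(components, key=len, reverse=True)


def degrees(adj):
    """Number of links per node; unlinked sequences are absent from the network."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def cluster_count_by_threshold(distances, thresholds):
    """Sensitivity analysis: number of clusters detected at each threshold."""
    return {t: len(find_clusters(build_network(distances, t))) for t in thresholds}
```

The cluster count typically rises as the threshold is relaxed and then falls again as clusters merge, so choosing the threshold that maximizes the count favours the finest resolution of distinct clusters.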

Bayesian phylogenetic analysis
To better understand the evolutionary history of HIV in Guangxi, we downloaded 418, 82, and 112 reference sequences from the HIV Database for CRF01_AE, CRF08_BC and CRF07_BC, respectively. After deduplication, only reference sequences containing the sampling dates necessary for Bayesian phylogenetic analysis were included. A total of 1,079 sequences were obtained, including 675 CRF01_AE strains, 244 CRF08_BC strains, and 160 CRF07_BC strains (Table S1). Of these, 43.3% (467/1,079) were collected in Qinzhou during 2017-2018 and the remainder were downloaded from the HIV Database. Three sub-datasets were created based on the major HIV subtypes.
To reveal the role of older people in HIV transmission events in more detail, we again used 50 years as the age cut-off for older people and defined the following four age-gender subgroups for the local subjects: older male (OM), older female (OF), younger male (YM), and younger female (YF). Temporal signals were evaluated in TempEst v1.5.3 [32], and Bayesian inference was performed in BEAST v1.10.5 using the Skygrid model with an uncorrelated lognormal relaxed clock under the general time reversible (GTR) substitution model. Separate Markov Chain Monte Carlo (MCMC) chains for CRF01_AE, CRF08_BC, and CRF07_BC were run for 600, 400 and 200 million states, respectively. Chains were sampled every 1,000 states and the first 10% of samples were discarded as burn-in. Convergence, defined as an effective sample size (ESS) ≥200, was assessed in Tracer v.1.7.2 [33]. A Bayesian Stochastic Search Variable Selection (BSSVS) procedure was used to determine the relationship between subgroups [34], and a robust counting (Markov jumps) approach was used to calculate the expected number of viral migrations [35]. Statistical support was measured using Bayes Factors (BF) and posterior probability calculated by SpreaD3 v0.9.6 [36]. Only results with BF ≥3 and posterior probability support ≥0.9 were further discussed [37].
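For readers unfamiliar with how a Bayes factor is obtained from BSSVS output, the following Python sketch shows the general calculation: the posterior odds that a migration rate is included in the model divided by the corresponding prior odds. It is an illustration, not the SpreaD3 implementation. The function names are hypothetical, the prior inclusion probability must be supplied by the user because it depends on the prior specified in the BEAST analysis, and the posterior probability threshold is assumed to apply to the rate indicator.

```python
import math


def bayes_factor(posterior_prob, prior_prob):
    """Posterior odds over prior odds for inclusion of one migration rate."""
    if not 0 < prior_prob < 1:
        raise ValueError("prior_prob must lie strictly between 0 and 1")
    if not 0 <= posterior_prob <= 1:
        raise ValueError("posterior_prob must lie between 0 and 1")
    if posterior_prob == 1:
        return math.inf  # the rate was included in every sampled state
    posterior_odds = posterior_prob / (1 - posterior_prob)
    prior_odds = prior_prob / (1 - prior_prob)
    return posterior_odds / prior_odds


def is_supported(posterior_prob, prior_prob, min_bf=3.0, min_posterior=0.9):
    """Apply both reporting thresholds used in this study (BF >= 3, posterior >= 0.9)."""
    return (bayes_factor(posterior_prob, prior_prob) >= min_bf
            and posterior_prob >= min_posterior)
```

Requiring both criteria is stricter than the Bayes factor alone: a rate with a small prior inclusion probability can reach BF ≥3 while its posterior probability remains well below 0.9.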

Statistical analysis
Descriptive statistics were performed on socio-demographic parameters comparing patients in transmission clusters, patients in transmission clusters containing RHI, patients in large transmission clusters, and patients with high linkage. Univariable and multivariable logistic regression models were applied to identify factors associated with clustering and high linkage. Multicollinearity between the independent variables was excluded using variance inflation factors (VIF, <5) and correlation coefficients (<0.8). The crude odds ratios (COR), adjusted odds ratios (AOR), and 95% confidence intervals (CI) were calculated. The Chi-square test or Fisher's exact test was used to compare the characteristics of patients with RHI or CHI in the network. All statistical analyses were performed using IBM SPSS Statistics v26.0 (IBM Corp., Armonk, NY, USA). P values were two-sided with a significance level of 0.05.
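The relationship between a logistic regression coefficient and the reported odds ratio with its 95% CI can be made explicit with a short Python sketch. It assumes the usual Wald interval, exp(β ± z·SE), which is the conventional default but is not stated in the text. The helper names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta, se):
    """Odds ratio and Wald 95% CI from a logistic coefficient and its standard error."""
    return (math.exp(beta),
            math.exp(beta - Z_95 * se),
            math.exp(beta + Z_95 * se))


def coef_from_reported(odds_ratio, lower, upper):
    """Recover the coefficient and standard error implied by a reported OR and CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return beta, se
```

Applied to the AOR reported in the abstract for older age (2.303, 95% CI: 1.012–5.241), coef_from_reported gives a coefficient of roughly 0.83 with a standard error of roughly 0.42, and passing these back through odds_ratio_ci reproduces the reported interval to within rounding, as expected for an interval that is symmetric on the log scale.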

Study population
Of the 518 plasma samples collected from Qinzhou (Figure S1a), 93.4% (484/518) were sequenced successfully and included in the follow-up analysis. According to the Chinese HIV/AIDS case reporting system, 481 HIV/AIDS cases were newly reported in Qinzhou from 2017 to 2018. Of these, we included 326 cases in our analysis, accounting for 67.7% (326/481) of the newly reported HIV/AIDS cases in Qinzhou during the study period. This sampling depth met the requirement (60%) for transmission network analysis recommended by the Chinese Center for Disease Control and Prevention (CDC).
As shown in Table 1

Identification of HIV transmission clusters
Sensitivity analysis of GD thresholds ranging from 0.1% to 2.5% indicated that a threshold of 1.8% was optimal for detecting transmission clusters (Figure S2). The transmission network analysis then revealed that 267 (55.2%) genetically linked individuals formed 55 clusters (2-63 individuals) (Table S2).

Large transmission clusters
We identified five large transmission clusters, including one CRF08_BC cluster, one CRF07_BC cluster and three CRF01_AE clusters (Figure 2). The largest cluster (i.e. cluster one) consisted of HETs (57.1%, 36/63) and IDUs (42.9%, 27/63), all subtyped as CRF08_BC. The high-degree individuals in cluster one were all IDUs who were long-term heroin users (10-22 years). The other four large transmission clusters were predominantly HETs, with older people accounting for 54.5% (12/22), 85.0% (17/20), 62.5% (10/16) and 38.5% (5/13) of clusters two to five, respectively. High-degree individuals in cluster two were HETs aged 30-49 years, and high-degree individuals in clusters three and four were HETs aged ≥50 years. Of note, three high-degree individuals in cluster three reported that their spouses were HIV positive, and one of them also reported a history of commercial sex. A 51-year-old female patient in cluster five, with a degree of 10, also reported her husband as HIV positive. The socio-demographic characteristics of all high-degree individuals are presented in Table S3. Table S4.

Transmission dynamics between age-gender subgroups
Bayesian phylogenetic analysis revealed a complex history of viral migration between age-gender subgroups, supporting links between older people (especially older males) and other populations ( Figure 4 and   respectively. Transmission from OM to YF accounted for 17.91% of the total CRF07_BC migration events. Overall, we observed that OM were the main source of migration events for CRF01_AE (90.37%) and CRF07_BC (100%), and OF and YM were the corresponding destinations. Moreover, YM were the main source of CRF08_BC migration events (91.94%), and OM and YF were the corresponding destinations.

Factors associated with clustering and high linkages
There was no multicollinearity among the independent variables included in this study. Multivariable logistic regression analysis found that age, ethnicity, education and current place of residence were the most significant factors associated with being part of a transmission cluster (  CI: 0.210-0.915) and outside Qinzhou (AOR = 0.137, 95% CI: 0.041-0.452) were less likely to cluster than those living in Pubei County. The HIV transmission pattern in Qinzhou was predominantly intraregional, with a low rate of inter-regional transmission (Figure S3b). Although RHI was not associated with clustering, all RHI patients were part of the transmission network, and 64.1% of their connections were to CHI patients, while 35.9% were to RHI patients (Figure S3c). We defined nodes with a degree ≥7 as those with high linkage. Multivariable logistic regression analysis showed that HIV subtype, infection status and current place of residence were significant factors associated with high linkage ( In addition, RHI patients had a higher degree of linkage within the network (AOR = 3.468, 95% CI: 1.315-9.146). Further analysis revealed that a higher proportion of RHI patients were older than CHI patients (72.0% vs. 47.5%, χ2 = 5.431, P = 0.020) (Table 4). RHI and CHI patients also differed in terms of

Discussion
This study is the first to explore the differential impact of older people on the network connection and virus transmission of the three major CRFs through an HIV molecular epidemiological survey in Guangxi. Evidence from network and phylogenetic analysis suggests that older people, especially OM, contribute to the rapid, sustained and complex HIV epidemic in Qinzhou, informing the development of targeted prevention and intervention strategies.
In accordance with a previous study, the most prevalent HIV subtypes in Qinzhou remain CRF01_AE, CRF08_BC and CRF07_BC [25]. This may reflect the ongoing and sustained transmission of HIV derived from already-circulating local strains [38]. Moreover, the prevalence of CRF01_AE and CRF07_BC was higher in older people than in patients under 50 years of age. This discrepancy may be related to the transmission characteristics of different HIV strains. Previous studies in China found an increased prevalence of CRF01_AE and CRF07_BC in populations infected through sexual transmission [39,40]. Especially in Guangxi, CRF01_AE predominated in the sexual transmission of HIV [41]. Moreover, due to the high proportion of X4 tropism [42,43], CRF01_AE tended to be associated with rapid disease progression and advanced immunodeficiency [44]. Even with combination ART, poor immune recovery was more frequent with CRF01_AE [45]. Older people tend to be in poor health and vulnerable to multiple diseases, and improving health-related quality of life is more important than just prolonging life. Therefore, it is necessary to enhance comprehensive treatment, care services and disease progression monitoring for older HIV patients.
In this study, we found a relatively low rate of new infections (5.2%), which may be associated with the high late diagnosis rate in Guangxi (70.2%) [46]. More critically, the rate of late diagnosis was much higher in patients aged 50 years and older [47]. In this study, 33.7% (163/484) of subjects had been diagnosed for more than 1 year, and 41.5% (201/484) were aged ≥50 years. Therefore, it is not surprising to find such a low rate of new infections. Our study found a higher linkage for patients with RHI. The most important reason may be that HIV-infected patients are more likely to transmit the virus in the early stage (acute or recent) of infection, when viral loads are high and opportunities for intervention are limited because they are often unaware of their status [48,49]. Higher linkages were also found in CRF07_BC compared to CRF01_AE, which may be related to the rapid expansion of CRF07_BC in recent years among sexually transmitted populations in China due to its slow disease progression, decreased virulence and enhanced transmissibility [50,51].
We found that older people were more likely to be present in transmission clusters, suggesting that they contribute significantly to local HIV transmission and should be monitored as a priority. The psychological and emotional needs of older people are magnified by the lack of companionship and care, and many of them turn to extramarital sexual relationships (such as with FSWs) for stimulation. There is already genetic evidence of an HIV transmission linkage between older male clients and FSWs [52]. Older male clients of low-cost commercial sex venues in rural Guangxi were at higher risk of HIV infection, especially those without stable sexual partners [18,19]. Additionally, a higher proportion of older male clients have numerous sexual partners and use condoms less frequently, making them an important bridge for HIV transmission from FSWs to low-risk groups [53]. Therefore, access to and use of condoms, as well as screening for HIV infection among older people, especially older males and their sexual partners, should be promoted in the future. In addition, the sources of transmission events inferred from Bayesian phylogenetic analysis in this study were all male. Specifically, we found that transmission events in CRF01_AE and CRF07_BC were OM-centred, whereas transmission events in CRF08_BC were YM-centred. Previous studies found that CRF01_AE and CRF07_BC were prevalent in populations infected through sexual contact [54,55], with CRF01_AE being the most prominent subtype in Guangxi and dominating the HIV epidemic in heterosexuals [41]. In recent years, there has been a rapid increase in the number of HIV infections in older people (mainly male) through sexual contact. As an emerging high-risk group, OM could easily transmit the virus to OF through heterosexual commercial sex, and some of them might transmit HIV to YM through homosexual contact or indirectly through bridging populations.
In contrast, CRF08_BC originated in Yunnan Province and has been prevalent among both IDUs and heterosexuals [56,57]. In this study, approximately 40.0% of CRF08_BC infections in Qinzhou occurred among IDUs, a predominantly YM population with a high risk of transmission [58,59]. CRF08_BC is therefore readily transmitted from the YM population to other subpopulations. For example, YM may transmit the virus to OM through intravenous drug use or bridging populations, or to YF through heterosexual contact. These findings indicate that the male population, especially OM, plays a key role in HIV transmission in Guangxi. Improved HIV detection is urgently needed to facilitate early treatment and effective prevention of second-generation transmission.
In this study, a total of five large transmission clusters were identified, with cluster one being the largest. Most of the individuals in this cluster were CHI middle-aged IDUs, and only a few were RHI heterosexual older people. In addition, all of the high-degree individuals were long-term heroin-using IDUs (10-22 years). These results suggest that HIV was still spreading widely in the middle-aged population in Qinzhou through intravenous drug use and sexual contact, and was gradually being transmitted to older people through sexual contact [57]. The other four large clusters were heterosexual clusters, with most nodes coming from the elderly population. In particular, the high-degree individuals and RHI patients in cluster three were all elderly. Moreover, 40.0% of individuals in this cluster reported that their spouses were HIV positive, indicating a higher risk of HIV transmission among older couples. A cohort study in Tanzania found that sero-discordant couples had much higher rates of HIV seroconversion within marriage than the general population, and that female spouses were at higher risk of acquiring HIV from their husbands [60]. However, with ART, the estimated risk of transmission is greatly reduced [61]. Therefore, it is essential to expand ART coverage among infected elderly patients.
Patients with RHI are at a higher risk of transmission due to their high viral load in the early stages of infection, and they can therefore represent recent transmission events to some extent [49]. The Chinese CDC's guidelines for HIV transmission networks recommend that clusters containing RHI should be considered as priority clusters for surveillance and intervention. It is therefore essential to identify RHI in newly diagnosed patients. This study identified 25 RHI patients who had acquired HIV through heterosexual contact, consistent with a previous study [4]. Also, older people accounted for 72.0% of the RHI, supporting the view that they are a high-risk group for new infections [16]. To make matters worse, HIV infections in the elderly population may be underestimated due to lack of motivation to seek testing [62] and late diagnosis [63]. Also, due to commercial heterosexual sex between older males and FSWs [52], aggregated clusters could easily form, which may also explain the higher linkage of recently infected patients. Of the 12 transmission clusters containing RHI, cluster six was dominated by recently infected elderly patients, which requires targeted intervention and ongoing monitoring. The fact that RHI patients were linked within the network and associated with high linkage suggests that RHI patients play a key role in local HIV transmission and should be the primary target of interventions. For example, strengthening treatment to achieve and maintain an undetectable viral load may reduce the risk of sexual transmission of HIV.
This study has several limitations. First, our results represent only the HIV epidemic in Qinzhou. Although new infections in older people are also increasing in different regions, the circumstances may differ from region to region. Therefore, future studies should pay attention to the contribution of older people to HIV transmission in a larger geographic context. Second, due to the shortcomings of cross-sectional studies, we were unable to assess the dynamics of HIV transmission. Long-term observation studies are needed to gain more information on HIV transmission patterns. Third, the participants included in this study were previously or newly reported cases, not newly infected individuals. Also, not all HIV cases could be included in our analysis, which may have led to an incomplete transmission network. Future studies should increase sampling efforts where possible. Finally, the reliability of the relevant statistical analysis may be limited by the small number of RHI identified in this study. We will collect more samples and use a larger sample size in future studies to improve the reliability of the statistical results.
In summary, our study reveals a complex HIV epidemic in Qinzhou, Guangxi. We found evidence that older people play a crucial role in the local transmission of HIV. Older people are more likely to be connected within networks. More importantly, OM were the primary source of transmission events for CRF01_AE and CRF07_BC. This suggests that HIV screening should be integrated into free medical examinations for older people nationwide in order to improve early detection and establish effective links with comprehensive treatment and care services. The government should provide more information materials, such as videos and television commercials, to promote HIV and STD testing among older people who engage in high-risk behaviours. Priority should also be given to identifying transmission hotspots and designing targeted interventions to reduce second-generation transmission.