HIV-1 molecular transmission network among sexually transmitted populations in Liaoning Province, China

Abstract Introduction: In recent years, with the development of molecular epidemiology, molecular transmission networks based on evolutionary theory and sequence analysis have been widely used in research on human immunodeficiency virus (HIV)-1 transmission dynamics and precise intervention for high-risk populations. The HIV-1 molecular transmission network is a new method to study the population's access to the network, the characteristics of clustering, and the characteristics of interconnection in the network. Here, we analyzed the characteristics of the HIV-1 molecular transmission network of sexually transmitted people in Liaoning Province. Methods: A study of HIV-infected persons who were sexually transmitted in Liaoning Province from 2003 to 2019. HIV-1 RNA was extracted, amplified and sequenced, and a phylogenetic tree was constructed to determine the subtype using the well matched pol gene region sequence. The gene distance between sequences was calculated, the threshold was determined, and the molecular transmission network was constructed. Results: 109 samples of pol gene region were obtained. The main subtype of HIV-1 was CRF01_AE, followed by B, CRF07_BC, etc. 12.8% of them were resistant to HIV. At the threshold of 0.55 gene distance, 60.6% of them entered the HIV-1 molecular transmission network. Workers, sample source voluntary counseling and testing, other testing, subtype B and drug resistance are the factors influencing the access to HIV-1 molecular transmission network. The subtype of CRF01_AE formed 6 clusters in the molecular transmission network. In the network, the difference of connection degree between different subtypes was statistically significant. Discussion: The three subtypes CRF01_AE, CRF07_BC and B that enter the molecular transmission network do not have interconnections, and they form clusters with each other. It shows that the risk of transmission among the three subtypes is less than the risk of transmission within each subtype. The factors affecting HIV-1 entry into the molecular transmission network were occupation, sample source, genotype and drug resistance. The L33F mutation at the HIV-1 resistance mutation site constitutes the interconnection in the largest transmission cluster in the network. The epidemiological characteristics of HIV-infected persons in each molecular transmission cluster show that 97% of the study subjects come from the same area and have a certain spatial aggregation. Conclusion: Constructing a molecular transmission network and conducting long-term monitoring, while taking targeted measures to block the spread of HIV can achieve precise prevention and control.


Introduction
The Human Immunodeficiency Virus/Acquired Immune Deficiency Syndrome (HIV/ acquired immune deficiency syndrome) epidemic has become one of the most critical issues that seriously affect public health. Since the start of the epidemic, around 75 million people have been infected with HIV, and in 2018, 37.9 million people were living with HIV worldwide. [1] By the end of October 2019, approximately 958 thousand people lived with HIV in China. [2] In recent years, with the development of molecular epidemiology, molecular transmission networks based on evolutionary theory and sequence analysis have been widely used in the study of HIV-1 transmission kinetics and precision intervention in high-risk populations, [3][4][5] especially HIV -1 Long-term monitoring of drug resistance and real-time prevention interventions. [6,7] The HIV-1 molecular transmission network refers to a group of sequences that are not randomly gathered and have a certain epidemiological correlation. It constructs a transmission network through the genetic information of people infected with HIV through similar viruses with similarities and connections, and restores the macro social network of infected people as much as possible, which aims to focus on analyzing the characteristics of infected people, and preventing and controlling the active and critical groups in the network. [8,9] HIV is transmitted through networks formed by closely connected individuals who engage in injecting or sexual behaviours. [10,11] However, the research on HIV-1 molecular cluster-based transmission network in China is still in its infancy, and only some regions have relevant reports. [12][13][14] Based on traditional epidemiological methods such as questionnaires, peer tracking and disease surveillance, molecular transmission network analysis provides a new method for the analysis of HIV-1 network characteristics at the population level. [15] HIV-1 molecular transmission network can reflect the social transmission network of HIV-1 to a certain extent according to the network access, clustering characteristics and interconnection characteristics of the study population. Promotion and analysis of molecular cluster-based communication network analysis techniques and strategies are mainly to further contribute to traditional social communication network analysis. [16] In this study, we selected the sexually transmitted HIV-infected persons in Liaoning Province, using molecular epidemiological analysis, supplemented by field epidemiological data, to reveal the distribution of HIV-1 subtypes and molecular transmission network in Liaoning Province, China.

Study subjects
In our study, patients who were diagnosed with HIV between 2003 and 2019 were selected from the " acquired immune deficiency syndrome integrated prevention and treatment information system" of the Chinese Center for Disease Control and Prevention. The present address of these HIV-infected persons is Liaoning Province. 117 of them had not received antiviral treatment. From these individuals, 109 sexually transmitted HIV-infected individuals were screened for the study. Demographic and behavioral data were collected for these 109 individuals, while their blood samples were collected. All of these subjects signed informed consent. CD4 + T lymphocyte cell counts were completed within 24 hours from the collected whole blood samples. The plasma was then separated at 1500 rpm for 15 minutes using a Beckman-Kurt centrifuge. The plasma was packed and frozen at À70°C. The viral load in plasma was determined using Roche TaqMan 48 and matched HIV viral load detection kit. HIV-1 RNA was extracted from plasma using QIAampRNA MiniKit reagent from QIAGEN. Nested reverse transcription fluorescence quantitative polymerase chain reaction amplification was performed on 1.2 KB long fragment of pol region gene (1-99 codons in protease region and 1-250 codons in reverse transcriptase region) by Inhouse method. The amplified products were identified by agarose gel electrophoresis imaging. The amplified positive products were sent to Beijing Bomeid Gene Technology Co. Ltd. for purification and gene sequencing. Sanger sequencing method was used. Sequencing primers refer to "HIV resistance monitoring strategy and detection technology." [17] 2.2.2. Sequence analysis. The measured sequence results were spliced and cleaned with the software Chromas1.62. Sequences were corrected with BioEdit 7.0. And compared with the international reference sequence (Los Alamos National Laboratory HIV Sequence Database). The aligned sequences were used to construct phylogenetic trees using the neighbor-joining method in MEGA7.0. The corresponding gene subtypes were determined by clustering with international reference strains. Using the online analysis tool COMET (https://comet.1ih.lu/ index.php) Review. The aligned sequences were imported into the HIV resistance database of Stanford University (USA). http:// hivdb.Stanford.edu) Online analysis of resistance mutation sites.

Molecular propagation network analysis.
The aligned sequences were imported into MEGA7.0 software. Gene distances between all pairs were calculated using the Tamura-Nei93 model. The Bootstrap value of branch nodes of phylogenetic tree species is ≥90%, the number of samples within the cluster is ≥2, and the average gene distance within the cluster is less than or equal to the set threshold, which is defined as a molecular propagation cluster. [18] By observing the total number of propagation clusters in the propagation network under a series of thresholds, it is found that the total number of propagation clusters in the network reaches a peak (9) when the threshold (i.e., the gene threshold when the propagation cluster is the largest and the sample size included is the largest) is 0.55. Therefore, a molecular propagation network was constructed using Cytoscape 3.7.2 software with 0.55 as the threshold to identify potential propagation partners. The degree is the degree of association. Represents the number of edge connections between a node and other nodes in a molecular propagation network. Access rate is the percentage of the total number of people entering the molecular transmission network.

Statistical analysis
Statistical analysis was performed using SPSS19.0 and Excel 2010. Count data are expressed in frequency. x 2 test was used to analyze the correlation between two categorical variables. Univariate and multivariate logistic regression models were used to analyze the influencing factors of the access rate. P < .05 indicated that the difference was statistically significant.

HIV-1 Molecular Transmission Rate and Its Influencing Factors
Under the 0.55 gene distance threshold, a total of 66 series entered the HIV-1 molecular propagation network, with a total access rate of 60.6% (Fig. 2).
Whether to enter the HIV-1 molecular transmission network is taken as the dependent variable. Univariate and multivariate logistic regression analysis was performed with sex, age, ethnicity, educational level, marital status, residence, occupation, route of infection, source of samples, genotype and drug resistance as independent variables ( Table 3). The results showed that worker, sample source testing consultation and other visiter testing, subtype B and drug resistance were the influencing factors for entering the HIV-1 molecular transmission network.

Characteristics of HIV-1 molecular transmission network
The CRF01_AE subtype forms six propagation clusters in the molecular propagation network, one of them is the largest propagation cluster in the network, containing 40 nodes. Subtype B forms two propagation clusters, both consisting of two nodes. The CRF07_BC subtype forms a propagation cluster consisting of five nodes. In the network, the median connectivity of CRF01_AE subtypes was 3 degrees (IQR, 2-7). The median connectivity of subtype B is 1 degree (IQR, 1-1), and the median connectivity of CRF07_BC subtype is 2 degrees (IQR, 2-4), the degree of connectivity between different subtypes was statistically significant (P < .01). The HIV-1 resistance mutation that enters the network is protease inhibitor-related resistance, which contains 3 nodes, and the mutation sites are L33F and Q58E. In the network, 77.3% had homosexual transmission, 3 degrees of connectivity (IQR, 1-6), 22.7% had heterosexual transmission, and 3 degrees of connectivity (IQR, 2-6). The place of residence is 83.3% in the local area and 16.7% in the field, and the connectivity is 3 degrees (IQR, 1-6).
HIV-infected individuals in the largest molecular transmission cluster are all CRF01_AE subtypes. The genetic distance of the gene was 0.010 ± 0.001. Among them, 27 cases were homosexual transmission and 13 cases were heterosexual transmission. The present address is from 6 urban areas of Shenyang. The second molecular transmission cluster HIV-infected patients were also CRF01_AE subtype. The genetic distance of the gene was 0.012 ± 0.002. They are all homosexual transmission, and their current address is Shenyang City. The characteristics of HIV-infected   Table 4.

Discussion
In this study, the gene sequences and related data of 109 HIVinfected persons sexually transmitted in Liaoning Province from 2003 to 2019 were analyzed. The major genotype was identified as CRF01_AE (76.1%), followed by subtype B (10.1%) and CRF07_BC subtype (7.3%). In this study, a molecular transmission network was constructed according to the gene distance between two sequences calculated by the Tamura-Nei93 model. The Tamura-Nei93 model simulates the conversion and transfer rates of nucleotides and can correct the deviation of HIV substitution and the inconsistency of base composition, taking into account the calculation speed and biological authenticity comprehensively. [19] In this study, we attempted to determine the threshold of gene distance by discriminating the transmission clusters among the sexually transmitted population in Liaoning Province. Under the threshold of 0.55 gene distance, the overall Table 3 Analysis of the influencing factors of HIV-1 entering the molecular transmission network of sexually transmitted people in Liaoning Province.  access rate was 60.6%. The higher the access rate, the higher the risk of transmission. The results showed that CRF01_AE access rate is 68.7%, CRF07_BC access rate was 62.5%. And CRF01_AE and CRF07_BC subtypes all formed propagation clusters with more than 2 connectivity degrees. The CRF01_AE subtype forms a large propagation cluster containing 40 nodes in the network. The CRF01_AE subtype mainly exists in homosexual communicators, accounting for 82.4%. This is consistent with Shanghong's report that two CRF01_AE strains were independently introduced into the men who have sex with men (MSM) population in Liaoning Province in the early and middle 1990 s and formed local epidemic clusters consistent. [20] By Shao Yiming team to cover the whole country CRF01_AE Near-fulllength sequence analysis was performed on 75 strains from the main endemic areas. Discovery of the CRF01_AE strain in China was introduced via Southeast Asian countries (Thailand) in the 1990 s. [21] The results show that all 15 nodes entering the network in heterosexual propagation are CRF01_AE subtype. Mainly due to the CRF01_AE subtype has been introduced early and has been prevalent for a long time. It has spread from the MSM population to the heterosexual and can form larger transmission clusters. The CRF07_BC subtype forms a propagation cluster consisting of 5 nodes, all of which are homosexual propagation. CRF07_ BC subtype virus was initially formed and mainly prevalent among drug users in China. [22] Previous research by Shang Hong's team suggested that a specific CRF07_BC lineage [23] exists in MSM, and it has a significant foundation in MSM populations across the country. B subtype formed two 1-degree connected clusters composed of MSM populations. Among the earliest reported MSM population in China, B subtype was the main epidemic strain, and some recombinant subtypes containing B subtype components appeared in the later stage. [24,25] In this study, although the B +C recombinant subtype did not enter the network, it also accounted for 4.6%. Concern should also be paid to subtype B and recombinant subtypes. The results of this study also show that CRF01_AE, CRF07_BC and B, The three subtypes are not interconnected and cluster with each other. It indicates that the genetic distance among the three is greater than the optimal genetic distance, and the transmission risk among each other is less than that within each subtype. The CRF01_AE subtype propagation cluster ratio of CRF07_BC and B have more propagation clusters. This indicates that there are many mutations in CRF01_AE subtype, forming multiple sources and thus forming multiple transmission clusters. CRF07_BC formed only one propagation cluster and the degree of association was ≥2. Therefore, CRF07_BC subtypes are monitored dynamically over a long period of time to enable precise prevention and control measures to block their transmission.
The results showed that the factors affecting HIV-1 entry into the molecular transmission network were occupation, sample source, genotype and drug resistance. Among them, workers, testing consultation, other patient testing, subtype B and drug resistance were higher than other groups. The main sexual transmission into molecular transmission network in Liaoning Province is homosexual transmission. Risk factors for homosexual HIV infection include occupation and sample source, as reported by Shang Hong's team. [26] The results showed that only 3 primary resistance mutations entered the molecular transmission network. Two of them were CRF01_AE subtype, resistance gene is protease inhibitor-related resistance, mutation site is L33F, Q58E. The L33F mutation constitutes interconnection in the largest spreading cluster in the network. Mutations K65R, Y181C+G190S, which produced high drug resistance, did not enter the transmission network.
The epidemiological characteristics of HIV-infected individuals within each molecular transmission cluster showed that 97% of the subjects were from different urban areas in the same city. The traffic between different urban areas is convenient. It has certain spatial aggregation. Studies have shown that homosexual and heterosexual transmission are positively correlated with spatial aggregation. [27] The genetic distance of each molecular transmission cluster also indicates that the infected strains have high homology.

Conclusion
In short, molecular communication networks can assess the risk of transmission through the network access rate and the degree of association. And can be based on the construction of molecular network to early warning of communication risk. [28] Network analysis can provide more accurate basis for prevention and control. It is suggested that the network analysis of molecular methods should be combined with social science and public health methods. Precise intervention should be carried out for high-risk groups to curb the spread of HIV.
This study has some limitations. A molecular transmission cluster only represents a group of highly associated infected individuals, and does not reflect the real transmission relationship. The retrospective analysis used in this study shows that the sample size is limited. The conclusion can only be used to observe the publicity, intervention and testing effect of antiviral treatment and voluntary counseling and testing in the past. On this basis, molecular network detection should be carried out to achieve accurate prevention and control of HIV.