Analysis of HIV-1 Molecular Transmission Network in Sexually Transmitted Population in Liaoning, China

Background In recent years, with the development of molecular epidemiology, molecular transmission networks based on evolutionary theory and sequence analysis have been widely used to study HIV-1 transmission dynamics and to guide precision interventions in high-risk populations, especially long-term monitoring of HIV-1 drug resistance and real-time prevention interventions. This study aimed to analyze the characteristics of the HIV-1 molecular transmission network in Liaoning, China. Methods We studied sexually transmitted HIV infections in Liaoning Province from 2003 to 2019. HIV-1 RNA was extracted, amplified and sequenced, and a phylogenetic tree was constructed from the well-aligned pol gene region sequences to determine the subtype. The genetic distance between sequences was calculated, the threshold was determined, and the transmission network was constructed. Results Pol gene region sequences were obtained from 109 samples. The main HIV-1 subtype was CRF01_AE, followed by B, CRF07_BC, and others; 12.8% of the samples carried drug resistance. At a genetic distance threshold of 0.55%, 60.6% of them entered the HIV-1 molecular transmission network. Being a worker, sample source (voluntary counseling and testing, VCT, or other testing), subtype B and drug resistance were factors influencing entry into the HIV-1 molecular transmission network. The CRF01_AE subtype formed six clusters in the molecular network. Within the network, the difference in degree between subtypes was statistically significant. Conclusion Liaoning Province should build a molecular transmission network and carry out long-term monitoring, so as to take targeted measures to block the spread of HIV and achieve precise prevention and control.


Background
In recent years, with the development of molecular epidemiology, molecular transmission networks based on evolutionary theory and sequence analysis have been widely used to study HIV-1 transmission dynamics and to guide precision interventions in high-risk populations [1][2][3], especially long-term monitoring of HIV-1 drug resistance and real-time prevention interventions [4][5]. However, research in China on HIV-1 molecular cluster-based transmission networks is still in its infancy, and only some regions have reported results [6][7][8]. Building on traditional epidemiological methods such as questionnaire surveys, peer tracking, and disease surveillance, molecular transmission network analysis provides a new way to analyze HIV-1 network characteristics at the population level [9]. Through the study population's rate of entry into the network, its clustering characteristics, and its interconnections within the network, the HIV-1 molecular transmission network can reflect the HIV-1 social transmission network to a certain extent. Molecular cluster-based transmission network analysis techniques and strategies are promoted mainly to complement traditional social transmission network analysis [10]. This study was based on molecular epidemiological analysis of sexually transmitted HIV infections in Liaoning Province, supplemented by field epidemiological data, to reveal the distribution and transmission of HIV subtypes in Liaoning, China.

Nucleic acid extraction and amplification
CD4+ T lymphocytes (CD4 cells for short) were counted in the collected whole blood samples within 24 hours. Plasma was then separated using a Beckman Coulter centrifuge at 1500 rpm for 15 minutes, aliquoted, and stored at -70℃. Roche's TaqMan 48 and the accompanying HIV viral load detection kit were used to determine the viral load in plasma. HIV-1 ribonucleic acid (RNA) was extracted from the plasma using QIAGEN's QIAamp RNA Mini Kit. An in-house nested reverse transcription fluorescent quantitative polymerase chain reaction (RT-PCR) method was used to amplify a 1.2 kb fragment of the pol gene region (codons 1-99 of the protease region and codons 1-250 of the reverse transcriptase region). The amplified products were identified by agarose gel electrophoresis. Positive amplification products were sent to Beijing Bomad Gene Technology Co., Ltd. for purification and gene sequencing; the sequencing primers followed "HIV Drug Resistance Surveillance Strategy and Detection Technology" [11].

Sequence Analysis
The raw sequences were assembled and cleaned using Chromas 1.62, then corrected and multiple-sequence aligned using BioEdit 7.0. The phylogenetic tree was constructed in MEGA 7.0 using the neighbor-joining method with the Kimura 2-parameter model (bootstrap = 1,000 replicates) to determine the subtype; branches with a bootstrap support value ≥ 70% were accepted. The aligned sequences were submitted to the Stanford University HIV drug resistance database (http://hivdb.stanford.edu) for online analysis of resistance mutation sites.
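As an illustration of the distance model named above, the following is a minimal sketch of the Kimura 2-parameter distance, d = -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The function name `k2p_distance` and the choice to skip gapped or ambiguous sites pairwise are assumptions made for this example; they are not taken from MEGA or from the study's workflow.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned nucleotide sequences.

    Hypothetical helper for illustration; sites with gaps or ambiguous
    bases are skipped (an assumption, not necessarily MEGA's setting).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    argument = (1 - 2 * p - q) * math.sqrt(1 - 2 * q)
    if argument <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(argument)


if __name__ == "__main__":
    # Toy sequences, not study data: one transition in ten sites.
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

Because the correction separates transitions from transversions, a single transition and a single transversion over the same number of sites give slightly different distances.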

Molecular transmission network analysis
The aligned sequences were imported into MEGA 7.0, and the Tamura-Nei 93 (TN93) model was used to calculate the genetic distance between all pairs of sequences. The total number of transmission clusters in the network was examined across a series of thresholds (0.14%-1.92%). The optimal threshold was taken as the genetic distance at which the network contained the largest number of transmission clusters and the largest number of samples. The number of clusters peaked (9) at 0.55%, so 0.55% was used as the threshold to build the molecular transmission network in Cytoscape 3.7.2 and to identify potential transmission partners. Degree (connectivity) is the number of edges linking a node to other nodes in the molecular transmission network. The network access rate is the percentage of all study subjects who entered the molecular transmission network.
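The threshold scan described above can be sketched in a few lines. This is a simplified illustration, not the study's Cytoscape workflow: the sample names and distances are invented, the helper names (`build_network`, `find_clusters`, `scan_thresholds`) are hypothetical, and linking pairs whose distance is less than or equal to the threshold is an assumption, since the text does not state whether the comparison is strict.

```python
from collections import defaultdict


def build_network(distances, threshold):
    """Link every pair of sequences whose genetic distance is <= threshold."""
    adjacency = defaultdict(set)
    for (a, b), d in distances.items():
        if d <= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def find_clusters(adjacency):
    """Return the connected components of the network (each has >= 2 nodes)."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


def scan_thresholds(distances, thresholds):
    """Count transmission clusters at each candidate threshold."""
    return [(t, len(find_clusters(build_network(distances, t)))) for t in thresholds]


if __name__ == "__main__":
    # Hypothetical pairwise distances (substitutions per site) for six samples.
    samples = ["s1", "s2", "s3", "s4", "s5", "s6"]
    distances = {
        ("s1", "s2"): 0.004, ("s1", "s3"): 0.012, ("s2", "s3"): 0.011,
        ("s4", "s5"): 0.005, ("s1", "s4"): 0.030, ("s5", "s6"): 0.045,
    }
    for threshold, n_clusters in scan_thresholds(distances, [0.001, 0.0055, 0.012, 0.03]):
        print(threshold, n_clusters)

    network = build_network(distances, 0.0055)
    degree = {node: len(neighbours) for node, neighbours in network.items()}
    access_rate = len(network) / len(samples)  # share of samples in the network
    print(degree, access_rate)
```

In this toy data the cluster count rises and then falls as the threshold increases, because separate clusters merge once the threshold is loose enough; that is the pattern the peak-based threshold choice relies on.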

Statistical analysis
Statistical analysis was performed using SPSS 19.0 and Excel 2010. Count data are presented as frequencies (proportions). The χ² test was used to analyze the association between two categorical variables, and univariate and multivariate logistic regression models were used to analyze the factors influencing the molecular transmission network access rate. P < 0.05 was considered statistically significant.
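For readers unfamiliar with the χ² test of association, the following minimal sketch computes the Pearson statistic and its P value for a 2 × 2 table using only the Python standard library. The counts are invented for illustration and are not study data; the function name `chi_square_2x2` is hypothetical, and no continuity correction is applied, which may differ from the SPSS output used in the study.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    `table` is [[a, b], [c, d]], e.g. rows = two groups and
    columns = entered the network / did not enter the network.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: 20 of 30 in group 1 and 10 of 30 in group 2 entered.
    chi2, p_value = chi_square_2x2([[20, 10], [10, 20]])
    print(f"chi2 = {chi2:.2f}, significant at 0.05: {p_value < 0.05}")
```

The closed-form P value holds only for a 2 × 2 table; larger tables need the general χ² distribution with (rows - 1) × (columns - 1) degrees of freedom.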

Basic characteristics of the research subjects
A total of 109 sexually transmitted cases with successfully obtained pol gene region sequences were included. Among them, 102 were male (93.6%), the median age was 39 years (interquartile range, IQR, 29-52), 96 were Han (88.1%), 42 had a junior high school education or below (38.5%), and 53 were unmarried.
Univariate and multivariate logistic regression analyses were performed with entry into the HIV-1 molecular transmission network as the dependent variable and gender, age, ethnicity, education level, marital status, place of residence, occupation, route of infection, sample source, genotype and drug resistance as independent variables (Table 2). The results showed that being a worker, sample source (voluntary counseling and testing, and testing of other clinic attendees), subtype B and drug resistance were factors influencing entry into the HIV-1 molecular transmission network.

Characteristics of HIV-1 molecular transmission network
The CRF01_AE subtype formed six transmission clusters in the molecular network, one of which was the largest cluster in the network and contained 40 nodes. Subtype B formed two transmission clusters, each consisting of two nodes. The CRF07_BC subtype formed one transmission cluster consisting of 5 nodes. In the network, the median degree was 3 (IQR, 2-7) for CRF01_AE, 1 (IQR, 1-1) for subtype B, and 2 (IQR, 2-4) for CRF07_BC; the difference in degree between subtypes was statistically significant (P < 0.01). The HIV-1 resistance mutations that entered the network were protease inhibitor-related; they involved 3 nodes, and the mutation sites were L33F and Q58E. Homosexual transmission accounted for 77.3% of the network, with a median degree of 3 (IQR, 1-6); heterosexual transmission accounted for 22.7%, with a median degree of 3 (IQR, 2-6). By place of residence, 83.3% of those in the network were local and 16.7% were from elsewhere, with a median degree of 3 (IQR, 1-6).

Discussion
In this study, the gene sequences and related data of 109 persons in Liaoning Province with sexually transmitted HIV infection during 2003-2019 were analyzed. The CRF07_BC subtype accounted for 7.3%. A molecular transmission network was constructed based on the genetic distance between pairs of sequences calculated with the TN93 model. The TN93 model accounts for the transition and transversion rates of nucleotides and can correct for substitution bias and unequal base composition in HIV, balancing computational speed and biological realism [12].
In this study, we tried to determine the genetic distance threshold by identifying transmission clusters among the sexually transmitted population in Liaoning Province. At a genetic distance threshold of 0.55%, the overall network access rate was 60.6%. The higher the network access rate, the higher the risk of transmission. The network access rate was 68.7% for CRF01_AE and 62.5% for CRF07_BC, and both subtypes formed transmission clusters with more than 2 degrees of connectivity. The CRF01_AE subtype formed a large transmission cluster with 40 nodes in the network. CRF01_AE was found mainly in cases of homosexual transmission, which accounted for 82.4%. This is consistent with the report by Shang Hong's team that two CRF01_AE strains were independently introduced into the MSM population in Liaoning Province in the early and mid-1990s and formed local epidemic clusters [13]. Shao Yiming's team performed a near full-length sequence analysis of 75 strains covering the main endemic areas of CRF01_AE nationwide and found that the CRF01_AE strains in China were introduced from Southeast Asia (Thailand) in the 1990s [14]. The 15 heterosexual transmission nodes that entered the network were all CRF01_AE, mainly because CRF01_AE was introduced early, has circulated for a long time, and has spread from the MSM population to heterosexual populations, so it can form larger transmission clusters. The CRF07_BC subtype formed one transmission cluster of 5 nodes, all of them homosexual transmission. CRF07_BC strains initially formed and mainly circulated among drug users in China [15]. Previous research by Shang Hong's team suggested that a specific CRF07_BC lineage [16] exists in MSM and is well established in MSM populations across the country.
Subtype B formed two clusters with a degree of 1, both composed of MSM. Subtype B was the main epidemic strain among the earliest reported MSM populations in China, and recombinant subtypes containing subtype B components appeared later [17][18]. In this study, although the B + C recombinant subtype did not enter the network, it still accounted for 4.6% of cases, so B and C recombinant subtypes also deserve attention. The results also show that the three subtypes that entered the molecular transmission network (CRF01_AE, CRF07_BC, and B) were not connected to one another; each clustered only with its own subtype. This indicates that the genetic distance between the three subtypes is greater than the optimal genetic distance threshold and that the risk of transmission between subtypes is lower than the risk of transmission within each subtype. CRF01_AE had more transmission clusters than CRF07_BC and B, which suggests that CRF01_AE carries more mutations and has multiple sources that form multiple transmission clusters. CRF07_BC formed only one transmission cluster, with a degree ≥ 2. Therefore, long-term dynamic monitoring of the CRF07_BC subtype should be carried out so that prevention and control measures can be precisely formulated to block its spread.
The results show that the factors affecting entry into the HIV-1 molecular transmission network were occupation, sample source, genotype and drug resistance. Among them, workers, voluntary counseling and testing, testing of other clinic attendees, subtype B and drug resistance were higher than the other groups. The sexually transmitted cases in Liaoning Province that entered the molecular transmission network were mainly cases of homosexual transmission. The risk factors for homosexual transmission of HIV include occupation and sample source, as reported by Shang Hong's team [19]. The resistance mutations that entered the network were protease inhibitor-related, and the mutation sites were L33F and Q58E. Two of the nodes carried the L33F mutation and were connected to each other in the largest transmission cluster in the network.

Conclusion
In short, the molecular transmission network can be used to assess transmission risk through the network access rate and degree, and the constructed molecular network can provide early warning of transmission risk [20]. Network analysis can provide a more accurate basis for prevention and treatment. It is recommended to use molecular methods for network analysis, combined with social science and public health methods, to deliver precise interventions for high-risk groups and curb HIV transmission.
There are certain limitations in this study. A molecular transmission cluster only represents a group of closely related infections and cannot reflect the true transmission relationships. The retrospective analysis used in this study has a limited sample size, and its conclusions can only reflect the past effects of publicity, intervention, and testing under antiviral therapy and voluntary counseling and testing.
On this basis, molecular network monitoring can help achieve precise HIV prevention and control.

Abbreviations
HIV Human immunodeficiency virus.

Declarations
Ethics approval and consent to participate The experimental protocol was established according to the ethical guidelines of the Helsinki Declaration and was approved by the Human Ethics Committee of Liaoning Provincial Center for Disease Control and Prevention. Written informed consent was obtained from each participant or their guardian.

Consent for publication
Not applicable.

Availability of data and material
The research data come from China CDC's "AIDS Integrated Prevention and Control Information System", which can only be downloaded using a U shield. To protect patient privacy, the data used to support the findings of this study are restricted as state secrets. Only anonymous data that do not involve patient privacy can be provided. Data supporting the findings of this study can be obtained from the corresponding author upon request.

Figure 1
Phylogenetic tree of HIV-1 subtypes constructed by the neighbor-joining method