Genetic diversity of hepatitis B virus in Yunnan, China: identification of novel subgenotype C17, an intergenotypic B/I recombinant, and B/C recombinants

Yunnan is considered to be a geographical hotspot for the introduction, mutation and recombination of several viruses in China. However, there are limited data regarding the genotypic profiles of hepatitis B virus (HBV) in this region. In this study, we characterized 206 HBV strains isolated from chronic hepatitis B patients in Yunnan, China. Initial genotyping based on 1.5 kb sequences revealed that genotype C was the most prevalent at 52.4% (108/206), followed by genotype B at 30.6% (63/206) and unclassified genotypes at 17.0% (35/206). To characterize the 35 unclassified strains, 32 complete HBV genomes were amplified and analysed; 17 isolates were classified within a known subgenotype, 8 were classified as B/C recombinants, 1 was classified as a B/I recombinant and 6 constituted a potentially novel C subgenotype that we designated as C17, based on the characteristics of a monophyletic cluster, >4% genetic distance, no significant evidence of recombination and no epidemiological link among individuals. Thus, multiple subgenotypes – namely B1, B2, B4, C1, C2, C3, C4, C8 and C17 – and two distinct intergenotypic recombinants exist in Yunnan, China, highlighting the complex and diverse distribution pattern of HBV genotypic profiles.


INTRODUCTION
Hepatitis B virus (HBV) infections impact negatively on human health and are a significant healthcare burden worldwide. Chronic HBV infection, which affects approximately 257 million people worldwide [1], is associated with liver damage leading to liver cirrhosis, hepatic decompensation and hepatocellular carcinoma (HCC) [2]. HBV is an enveloped DNA virus with a relaxed circular, partially double-stranded DNA genome that is approximately 3200 nucleotides (nt) long and contains four overlapping open reading frames encoding a polymerase (P), a surface protein (S), a precore/core protein (C) and the X gene product (X) [3,4]. HBV exhibits high genetic diversity, since the polymerase lacks proofreading activity [5].
In China, nearly 80 million people are chronic carriers, and about 300 000 die from liver cirrhosis and HCC caused by chronic HBV infection each year [14,15]. Many studies have revealed that different HBV genotypes/subgenotypes affect clinical outcome, disease prognosis and response to antiviral treatment in infected patients [16,17]. HBV genotype B is predominant in southern China, while genotype C is more common in the northern and central areas of the country [18]. Yunnan is the most southwestern province of China and shares borders with Myanmar in the west, Laos in the south and Vietnam in the southeast. Several previous reports showed that genotype C is the most abundant, followed by genotypes B, D and I [12,19]; among genotype B and C isolates, subgenotypes B2 and C2 are predominant in Yunnan, China [20,21]. However, there is still limited information regarding the diversity and circulation of HBV genotypes and subgenotypes in Yunnan. This was investigated in the present study by examining 206 patients with chronic hepatitis B in Yunnan, China.

Phylogenetic and recombinant analyses based on full-length HBV genome sequences
To further characterize the 35 unclassified HBV strains, 32 full-length HBV genome sequences were amplified, sequenced and analysed. The phylogenetic analysis revealed that 16 HBV strains belonged to 5 distinct subgenotypes; 10 were classified within subgenotype C8, 2 each were classified within C4 and C1, 1 was classified within C3 and 1 was classified within C2 (Fig. 3a). The remaining 16 unclassified strains were evaluated for potential recombination events using the jpHMM and SimPlot programs. Nine were identified as potential recombinants. Two of the isolates (YNKM40H and YNKM19H) harboured a recombined form of genotypes B and C (Fig. 3b, c), with a B fragment inserted into the X region (nt 1531-1781) of the C backbone (Fig. 4). Six isolates were identified as genotype B/C recombinants (Fig. 3d-i) with different recombinant fragment lengths and breakpoints (Fig. 4). Notably, strain YNKM91 was classified as a genotype B/I recombinant (Fig. 3j), with a B fragment inserted into the preS1 region (nt 1-708) of the I backbone (Fig. 4).
Of the remaining seven strains that showed no evidence of recombination, strain YNKM21 was phylogenetically distant from the reported HBV C subgenotypes (Figs 3k and 5a), but was closely related to subgenotype C2, with 95.3 % nucleotide identity, indicating a quasi-subgenotype C2.

Identification and characterization of a novel HBV subgenotype
To confirm that the remaining six strains belong to a novel C subgenotype, their complete genome sequences were analysed phylogenetically, along with HBV reference sequences of representative subgenotypes C1-C16. The results showed that the strains formed a distinct monophyletic cluster supported by a high bootstrap value of 100 % and were distantly related to all known HBV C subgenotypes (Fig. 5b). We therefore tentatively designated these strains as constituting the novel subgenotype C17.
The intergroup nucleotide divergence (mean±SD, %) over the full-length genome sequences of C17 was compared to that of representative subgenotypes (C1-C16) (Fig. 5b). The results revealed that C17 differed from the known HBV subgenotypes C1-C16 by 4.1-6.1 %. An amino acid sequence comparison between HBV C17 isolates and other subgenotypes (C1-C16) indicated that C17 had 17 unique amino acid motifs within the P gene and preS1/S2 and X regions (Fig. 5c).
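As an illustration of the >4 % criterion, the following is a minimal sketch of how a mean intergroup nucleotide divergence could be computed from aligned genomes. It assumes a simple uncorrected p-distance (proportion of differing sites), which may differ from the distance model used for the values reported here; the function names and the 4 % default are illustrative only.

```python
from itertools import product
from statistics import mean

VALID_BASES = set("ACGT")


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap or an ambiguous base in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in VALID_BASES and base_b in VALID_BASES:
            compared += 1
            differences += base_a != base_b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def intergroup_divergence(group_a, group_b) -> float:
    """Mean pairwise p-distance (in percent) over all cross-group pairs."""
    return 100 * mean(p_distance(a, b) for a, b in product(group_a, group_b))


def exceeds_threshold(group_a, group_b, threshold_pct: float = 4.0) -> bool:
    """True if the mean intergroup divergence is above the given threshold."""
    return intergroup_divergence(group_a, group_b) > threshold_pct
```

In practice, `group_a` would hold the aligned candidate genomes and `group_b` the aligned reference genomes of one known subgenotype, with the comparison repeated for each of C1-C16.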

DISCUSSION
In the present study, we evaluated the HBV genotype, subgenotype and recombinant distributions among 206 patients with chronic hepatitis B in Yunnan, China. Our results showed that genotype C is the predominant genotype (63.1 %) in Yunnan followed by B (30.6 %), which is consistent with previous findings (76.9 and 15.4 %, respectively, in Yunnan [22] and 62.5 and 33.3 %, respectively, in Kunming [19]). We identified multiple subgenotypes (B1, B2, B4, C1, C2, C3, C4, C8 and C17), two intergenotypic recombinants (B/C and B/I) and a quasi-subgenotype (C2). Our results demonstrate that Yunnan has a high degree of HBV genotype diversity and complexity, posing a challenge for antiviral therapy to control HBV infection.
The C1 subgenotype of HBV was predominant in subjects from the Dai ethnic group from Dehong and was more prevalent there than in Kunming and Chuxiong. This may be explained by the geographical location of the Dehong prefecture, which shares a border with Myanmar, where there is a very high prevalence of the HBV C1 subgenotype [22]. It has also been reported that HBV C1 is mainly found in Southeast Asian countries, including Vietnam, Myanmar and Thailand [23]. We therefore speculate that the HBV subgenotype C1 circulating in Dehong originated in Myanmar. It is also worth noting that a high genetic diversity of HBV, including eight HBV subgenotypes and two different recombinant forms, was observed among individuals from the Han ethnic group in Kunming. This may be due to the fact that Kunming is a provincial capital city and also a centre of tourism and trade with other provinces in China and Southeast Asia. The frequent contact among people from different regions may have resulted in multiple introductions of alien HBV strains, which then mixed with indigenous ones. This is supported by our previous finding that the clusters of subgenotypes B2 and C2 comprise strains from China and other East Asian countries, while subgenotypes C1, C5 and C7 and genotype I form a cluster with strains from Southeast Asia [19].
The most important finding of this study was the identification of a novel HBV subgenotype. The characteristics of a highly supported monophyletic cluster, >4 % genetic distance, no significant evidence of recombination and unique amino acid signatures, and the fact that the six strains in question were isolated from individuals with no epidemiological link, are consistent with the standard classification of a novel HBV subgenotype [24,25], which we designated as C17. In terms of nucleotide sequence similarity and phylogeny, C17 was most closely related to HBV subgenotype C8, suggesting that they originate from the same HBV lineage.
Intergenotype recombination plays a key role in HBV emergence and genetic variability and presents enormous challenges for vaccine design and antiviral therapy strategies [26]. We found a relatively low frequency of intergenotypic recombinant HBV (4.4 %), of which most strains were genotype B/C recombinants. Notably, these isolates showed a variety of recombination breakpoints, including at nt 400-667, 770-1458, 1500-1800 and 2771-3200 (Fig. 4), suggesting that there is no hotspot for recombination. Another important finding of our study is the first observation of a B/I intergenotype HBV recombinant. Previous studies have shown that intergenotypic recombinants exhibit circulation patterns that are similar to those of their original genotypes [13]; for instance, B/C recombinants are limited to South and East Asia and are most prevalent in regions where HBV genotypes B and C are widespread, including China, Indonesia, Japan, Malaysia, Philippines, Taiwan, Thailand and Vietnam [27,28]. It is therefore tempting to speculate that the B/I recombinant arose from the abundance of genotypes B and I circulating in Yunnan. In fact, while Yunnan has a high prevalence of genotype B, genotype I is uncommon. Relatively few genotype I strains have been reported in Yunnan [12,19]. We did not observe any genotype I strains in the present study, possibly due to the small study population. HBV intergenotype recombination is easily detected on the basis of phylogenetic, SimPlot and bootscanning analyses. However, mixed infection by two different HBV genotypes or subgenotypes can result in an erroneous conclusion that recombination has occurred. This may be attributed to the fact that recombination analysis performed on PCR products can be biased by template switching in the first rounds of PCR amplification, or by the splicing of full-length sequences from subgenomic PCR fragments. To eliminate this, it is necessary to select a genotype-independent PCR amplification strategy and obtain complete HBV genome sequences from a unique PCR fragment [13]. In this study, to demonstrate that the strain YNKM91 is a B/I intergenotype HBV recombinant, we obtained three full-length genomes from a unique PCR fragment and repeated the PCR three times on separate occasions (MH746811-MH746813). Our analyses of nucleotide similarities and phylogenetic (Fig. S1a), recombinant (Fig. S1b) and subregion trees (Fig. S2) confirmed that the strain YNKM91 is a genuine B/I recombinant and not an artefact.
Our study documents the complex and diverse distribution patterns of HBV subgenotypes and intergenotype recombinants in Yunnan, China. We identified a novel HBV subgenotype (C17) and characterized a previously unreported B/I intergenotype recombinant. These findings provide an insight into the genetic characteristics of HBV strains in Yunnan, which can be used to develop effective measures to prevent the transmission of HBV in this region.

Study population
Plasma samples were collected between June 2016 and September 2017 from 206 chronic hepatitis B patients from 3 ethnic groups (Han, Yi and Dai) residing in Kunming, Chuxiong and Dehong prefectures, respectively, in Yunnan, China. Kunming is the capital and largest city of Yunnan province, southwest China and is a multi-ethnic city; the majority of the population are Han and Yi, followed by Hui, Bai, Miao, Hani, Zhuang, Dai, Lisu and other ethnicities. Chuxiong is located in the central part of Yunnan Province and abuts Kunming to the east. The residents are mainly made up of the Yi ethnic minority (Yi autonomous prefecture). Dehong is situated at the western boundary of Yunnan province and shares a boundary with Myanmar; 51.83 % of the population comprise minority ethnicities, mostly Dai and Jingpo.
The retrieval location and quantity of the samples collected for this study are shown in Fig. 1. The samples met the following inclusion criteria: (1) hepatitis B surface antigen-positive for 6 months with normal serum alanine aminotransferase (ALT) levels; (2) subject was residing in Yunnan province and was over 18 years old; (3) complete demographic information and clinical data were available for subject; (4) subject consented to the use of patient information in studies on HBV epidemics; and (5) subject was treatment-naïve during sampling. Demographic information and data on risk factors for HBV transmission were collected through a questionnaire. Standard clinical biochemical tests for hepatic and renal function and blood glucose and fat were performed at the time of sampling. The remaining plasma samples were aliquoted and stored at −80 °C. Demographic information relating to age, gender and ethnicity was obtained via self-report questionnaires. The clinical parameters of disease progression, including ALT and aspartate aminotransferase (AST) levels and HBV DNA load, were determined at the time of sampling. Plasma was separated from whole blood samples using EDTA tripotassium salt and stored at −80 °C for HBV DNA extraction. The study protocol was approved by the First People's Hospital of Yunnan Province Ethics Committee (YNFPH-2016-012). Written informed consent was obtained from all participants.
DNA extraction and HBV gene amplification
HBV DNA was isolated from 200 µl plasma using the AxyPrep Blood Genomic DNA Miniprep kit (Axygen Biosciences, Union City, CA, USA) according to the procedure described in the manual. To determine the HBV genotype and subgenotype, partial sequences of nearly 1.5 kb (GQ358158: 55-1536) were amplified using a nested PCR strategy with two oligonucleotide primer pairs. HBVspF01 (5′-CTGCTGGTGGCTCCAGTTCAGGRAYAGT-3′) and HBVspR01 (5′-GGCGTTCACGGTGGTYTCCATGCGACG-3′) were used for the first round with the following cycling conditions: 94 °C for 5 min; 30 cycles of 94 °C for 30 s, 55 °C for 30 s and 72 °C for 2 min 30 s; and 72 °C for 10 min. HBVspF02 (5′-GGRAYASTRAACCCTGTTCCRAMTAYT-3′) and HBVspR02 (5′-AARTGCACACGGTCCGGCAGATGAGAAG-3′) were used for the second round with the following cycling conditions: 94 °C for 5 min; 30 cycles of 94 °C for 30 s, 58 °C for 30 s and 72 °C for 2 min 30 s; and 72 °C for 10 min. To confirm the novel subgenotype and recombinant, and in particular to eliminate the possibility of mixed infection by two or more different HBV genotypes, complete HBV sequences from a unique PCR fragment were amplified as previously described [29,30]. Amplified PCR products were detected by electrophoresis on a 1.0 % agarose gel under ultraviolet illumination and purified using a DNA purification kit (Tiangen, Beijing, People's Republic of China).
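The primers above contain IUPAC ambiguity codes (R, Y, S, M), so each one represents a pool of oligonucleotides. The sketch below, which is illustrative and not part of the study's workflow, counts and enumerates the concrete sequences in each pool. The primer strings are taken from the text on the assumption that the spaces inside them are typesetting artefacts; the helper names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as given in the text (5' to 3'), internal spaces removed.
PRIMERS = {
    "HBVspF01": "CTGCTGGTGGCTCCAGTTCAGGRAYAGT",
    "HBVspR01": "GGCGTTCACGGTGGTYTCCATGCGACG",
    "HBVspF02": "GGRAYASTRAACCCTGTTCCRAMTAYT",
    "HBVspR02": "AARTGCACACGGTCCGGCAGATGAGAAG",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for code in primer.upper():
        count *= len(IUPAC[code])
    return count


def expand(primer: str) -> list:
    """Enumerate every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[code] for code in primer.upper()]
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    for name, sequence in PRIMERS.items():
        print(name, len(sequence), "nt, degeneracy", degeneracy(sequence))
```

A higher degeneracy broadens the range of genotypes a primer can anneal to, at the cost of a lower effective concentration of each individual sequence.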

Phylogenetic analysis
The sequencing data were initially verified using the National Center for Biotechnology Information's Basic Local Alignment Search Tool (BLAST). Reference sequences in GenBank were used for a comparative analysis of all HBV genomic sequences. Multiple alignments of the selected sequences were performed using Clustal v.1.8.1 software. The generated data were processed using BioEdit v.7.1.5 software. Phylogenetic trees were constructed based on the obtained datasets with the maximum-likelihood method using MEGA v.6.0.6 and the general time-reversible plus gamma distribution plus invariant sites (GTR+G+I) model. Bootstrap values were calculated based on 1000 replications of the alignment.

Recombination analysis
Potential HBV genetic recombination events were analysed using the jumping profile hidden Markov model (jpHMM) included in the Los Alamos human immunodeficiency virus (HIV) database, and were further validated with SimPlot v.3.5.1 software to generate similarity plots and perform bootscanning and informative site analyses. The structure of the new HBV recombinant forms was elucidated based on information on breakpoints obtained from these analyses.
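The idea behind a similarity plot can be sketched as a sliding-window comparison of a query genome against aligned genotype references. The code below is a simplified illustration and not SimPlot's implementation: it computes plain percent identity per window (no bootscanning, no distance correction), with the 300 bp window and 50 bp step taken from the figure legends as defaults. All function names are hypothetical.

```python
def window_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two aligned segments, ignoring gapped sites."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def similarity_profile(query: str, reference: str, window: int = 300, step: int = 50):
    """List of (window midpoint, percent identity) along an aligned pair."""
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (start + window // 2,
         window_identity(query[start:start + window], reference[start:start + window]))
        for start in range(0, len(query) - window + 1, step)
    ]


def best_match_per_window(query: str, references: dict, window: int = 300, step: int = 50):
    """For each window, report the reference name most similar to the query."""
    if not references:
        return []
    profiles = {
        name: similarity_profile(query, seq, window, step)
        for name, seq in references.items()
    }
    names = list(profiles)
    result = []
    for i, (midpoint, _) in enumerate(profiles[names[0]]):
        best = max(names, key=lambda name: profiles[name][i][1])
        result.append((midpoint, best))
    return result
```

A change in the best-matching reference between neighbouring windows hints at a possible breakpoint, which would then need confirmation by bootscanning and phylogenetic analysis of the subregions.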

Fig. 1. Maps of the study region and geographical distribution of subjects from all three prefectures of the Yunnan province of southwestern China. The Yunnan province of southwestern China is marked in dark grey, and Kunming city prefecture, Chuxiong prefecture and Dehong prefecture are highlighted in light grey. MYA, Myanmar; THA, Thailand; LAO, Laos; VIE, Vietnam.

Fig. 2. Phylogenetic analysis of the approximately 1.5 kb partial sequences (GQ358158: 55-1536) of HBV. Phylogenetic analysis was performed by the maximum-likelihood method, based on the GTR+G+I substitution model, with 1000 bootstrap replicates using the software MEGA v6. Han ethnic patients from Kunming are indicated by dark blue triangles, Yi ethnic patients from Chuxiong are indicated by red triangles and Dai ethnic patients from Dehong are indicated by bright blue dots. (a) Circular tree including all HBV genotypes (A-I), subgenotypes B1-B9 and subgenotypes C1-C16. (b) Separate clade including the subgenotypes B1, B2 and B4. (c) Separate clade including the subgenotype C1. (d) Separate clade including the subgenotype C2. (e) Separate clade including the unclassified subgenotypes and/or recombinants.

Fig. 3. Phylogenetic and recombinant analyses based on full-length HBV genome sequences. (a) The known HBV genotype/subgenotype reference sequences from the previous report were used. Phylogenetic analysis was performed by the maximum-likelihood method, based on the GTR+G+I substitution model, with 1000 bootstrap replicates, using the software MEGA v6. (b)-(k) Bootscanning plots were constructed using SimPlot 3.5.1 software based on 100 replicates with a 300 bp sliding window moving in steps of 50 bases.

Fig. 4. Analysis of recombinant fragments and breakpoint positions for HBV intergenotypic recombinant sequences from nine strains.The sequences are colour coded by genotype: the blue column represents genotype B, the purple column represents genotype C and the green column represents genotype I.

Fig. 5. Analysis of full-length genome sequences of novel HBV subgenotype C17. (a) Recombinant analysis was performed using SimPlot 3.5.1 software based on 100 replicates with a 300 bp sliding window moving in steps of 50 bases. (b) Maximum-likelihood tree based on phylogenetic analysis of the C subgenotypes (C1-C17) and the estimated intergroup nucleotide divergence between C17 and the other known C subgenotypes (C1-C16). (c) Signature pattern analyses of the newly named HBV subgenotype, C17, based on the complete genome.

Fig. 6. Comparison of HBV subgenotype distributions among three different ethnic prefectures in Yunnan, China. The different HBV subgenotypes are shown in different colours.