Molecular Epidemiology of Hepatitis B Virus (HBV) Genotypes Prevalent in KP

Hepatitis B virus (HBV) infection is a major public health problem: about 2 billion people worldwide have been infected and more than 350 million are chronic HBV carriers, including in Pakistan, where the estimated prevalence is 3%. HBV is classified into 10 genotypes (A-J), defined by more than 8% sequence divergence across the whole genome. Although HBV is highly endemic in Pakistan, no large-scale, sequence-based study of HBV genotypes has yet been reported. This study therefore aimed to explore the existing patterns of HBV genotypes in Khyber Pakhtunkhwa (KP), the third most populous province of Pakistan, using sequencing and phylogenetic analysis of the HBV S gene. A total of 3000 samples from chronically HBV-positive patients were collected from the 7 most populous districts of KP and analyzed by immunochromatography (ICT), followed by qualitative PCR for confirmation. Type-specific PCR or restriction fragment length polymorphism (RFLP) analysis, together with sequencing of the partial S gene in randomly selected samples, was carried out to characterize HBV genotypes. We obtained 100 S gene nucleotide sequences, of which 28 sequences representing the full diversity of the sequenced types were used for phylogenetic analysis in MEGA 6. Active HBV infection was confirmed in all patients by qualitative PCR, and three genotypes (A, C and D) were confirmed by type-specific PCR and RFLP. The most prevalent genotype was D (68.83%), followed by genotype A (22.63%) and genotype C (8.53%). Phylogenetic analysis of the S gene sequences showed that some of our isolates clustered with local isolates, showing close homology with them, while others clustered with foreign isolates with high bootstrap values. However, one isolate did not match any HBV strain available in online repositories, which points towards considerable divergence and a distinct origin of the strain.


Hepatitis B virus (HBV) infection remains a critical health problem with significant morbidity and mortality worldwide. Approximately 2 billion people have been infected with the virus, and more than 350 million of them are chronically infected (Schweitzer et al., 2015). Cirrhosis and hepatocellular carcinoma (HCC) are the major complications of HBV infection, which causes 1 to 2 million deaths annually and is the 10th leading cause of mortality around the globe (Yoo et al., 2018; Rehermann et al., 2005). Nine million people in Pakistan are infected with HBV and 0.27 million are believed to be chronic carriers, with an overall prevalence of 5% in the general population and around 20% in high-risk populations (Mehr et al., 2012). In Khyber Pakhtunkhwa (KP), hepatitis B is considered one of the top risks for liver infection, with an average prevalence of 2.70%.
Hepatitis B virus was first identified in 1965, and after thorough research its molecular virology is now quite well understood. It belongs to the family Hepadnaviridae and possesses a partially double-stranded circular DNA genome of 3.2 kb that comprises four major open reading frames (ORFs) encoding the envelope (preS1, preS2 and HBsAg), core, polymerase and X proteins (Magnius et al., 1997; Stuyver et al., 2000). On the basis of genetic differences, HBV has been classified into 10 genotypes (A-J), defined by more than 8% sequence divergence across the whole genome, and 40 subgenotypes (more than 4% sequence divergence) (Pourkarim et al., 2014). These genotypes have distinct geographical distributions. Genotype A is found worldwide and is the major genotype in Europe, Africa, America and India. Genotypes B and C are prevalent in East and Southeast Asia (Mahtab et al., 2008). Genotype D has been reported globally and is mostly found in the Middle East and Mediterranean countries, while genotype E occurs in western sub-Saharan Africa (Mulders et al., 2004; Kramvis et al., 2005). Genotype F is frequent in America and Polynesia, genotype G is linked to North America and Europe, and genotype H, accepted more recently, has been reported from the US (Mexico) (Kramvis et al., 2005). Genotype I, a newly proposed rare strain, was first described from Hanoi, Laos and northern Vietnam and has recently been reported from northwestern China and northeastern India (Huy et al., 2008), and genotype J has been reported from Japan (Kao, 2011; Tanwar and Dusheiko, 2012).
Because HBV genotypes play a crucial role in the progression of hepatic disease, transmission routes, treatment efficacy and clinical outcome (Tanwar and Dusheiko, 2012; Tanaka and Mizakomi, 2007), identifying the specific HBV genotype and characterizing the divergence of HBV sequences will be quite useful for improving clinical practice, diagnosis and outcomes.
Khyber Pakhtunkhwa (KP) is located in the northwest of Pakistan and is the country's third most populous province. A recent report showed that KP, with a population of 30.25 million, has a high HBV prevalence of 3% and more than 10 million chronic HBV carriers. Despite this threatening situation, HBV genotypes and genome sequences have not been sufficiently evaluated in KP, and only scanty data are available. The current study was therefore planned to examine the existing patterns of HBV genotypes and their genetic variation among chronically HBV-infected patients from KP province, Pakistan.

Sample collection
A total of 3000 blood samples were collected from chronically HBV-positive patients. The research plan was approved by the ethical board of Abdul Wali Khan University Mardan (AWKUM), and informed consent was obtained from the patients or their caretakers at the time of sampling. Blood samples were collected from chronically positive patients admitted to different healthcare units in the study area, including Khyber Teaching Hospital Peshawar (KTH), Lady Reading Hospital Peshawar (LRH), Hayatabad Medical Complex (HMC), Mardan Medical Complex (MMC), Federal Genomics Lab Islamabad and Biotech Labs Rawalpindi. From each patient, 5 ml of blood was drawn with a sterile syringe, and the serum was separated and stored at -20 °C until DNA extraction.

Immunochromatography (ICT)
All serum samples were first screened for the presence of HBsAg by immunochromatography (Abbott Laboratories, USA).

DNA extraction
DNA was extracted from each sample using the Promega Maxwell® HT Viral Total Nucleic Acid Kit, Custom (Cat# AX2340), according to the manufacturer's protocol.

Qualitative screening of HBV DNA and type-specific PCR
To detect active HBV infection, all samples included in the study were analyzed by qualitative PCR for confirmation. Qualitative viral detection was performed with a two-step PCR as described earlier (Norder et al., 2004), which is more sensitive and accurate than serological methods. Positive and negative controls were included in each PCR run.

HBV genotyping
All samples were processed by either type-specific PCR or RFLP to distinguish the HBV genotypes. Type-specific PCR was used to screen for HBV genotypes A-H as previously reported (Farazmandfar et al., 2012). For RFLP-based genotyping, two enzymes, AfaI (GT/AC) and HinfI (G/ANTC), were confirmed to distinguish between the four major HBV genotypes. Nucleotide sequences of HBV genotypes A-D were retrieved from online DNA databases (NCBI/GenBank/DDBJ). Two sets of primers were designed with the online Primer3 tool and checked for cross-complementarity. A highly conserved region of the S gene was targeted for primer design. The first primer set amplified a comparatively larger 320 bp fragment of the viral genome, while the second primer set amplified a smaller 230 bp fragment.
Restriction digestion of confirmed HBV PCR products was carried out to identify the HBV genotypes by RFLP of the S gene. Suitable restriction enzymes were chosen with the online NEBcutter tool, and the restriction digestion was planned with the NEBcloner tool. To 100 µl of DNA, 1/10 volume of 5 M NaCl and a double volume of ethanol were added; the mixture was kept on ice for half an hour and then centrifuged at 14000 rpm for 10 min, and the supernatant was discarded. The pellet was washed with 70% ethanol, re-dissolved in 15 µl of nuclease-free water and kept at 4 °C until use. The amplified HBV product was digested with two restriction enzymes, HinfI and AfaI. A 30 µl reaction mixture was set up in a PCR tube containing 10 µl of amplified PCR product, 1 µl of restriction enzyme (10 U/µl), 5 µl of reaction buffer and 14 µl of nuclease-free water, and was incubated at 37 °C for 4 h. The PCR-RFLP products, pre-stained with ethidium bromide, were run on a 4% agarose gel, and the restriction fragments were visualized under ultraviolet light.
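The logic of the RFLP readout can be illustrated with a minimal in silico digest. The sketch below is not the authors' pipeline; it only assumes the recognition sites given above, with AfaI cutting GT^AC and HinfI cutting G^ANTC, and a linear amplicon. The function names and the toy sequence are hypothetical and do not represent a real HBV amplicon.

```python
import re

# Recognition sites as stated in the Methods; the integer is the cut offset
# within the site (AfaI: GT^AC, HinfI: G^ANTC, where N is any base).
ENZYMES = {
    "AfaI": ("GTAC", 2),
    "HinfI": ("GA[ACGT]TC", 1),
}


def cut_positions(seq, enzyme):
    """Return 0-based cut indices of `enzyme` on a linear sequence."""
    pattern, offset = ENZYMES[enzyme]
    # The lookahead allows overlapping recognition sites to be found.
    return [m.start() + offset
            for m in re.finditer(f"(?={pattern})", seq.upper())]


def digest(seq, enzymes):
    """Return fragment lengths (bp), in 5'->3' order, after a combined digest."""
    cuts = sorted({p for e in enzymes for p in cut_positions(seq, e)})
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical 40 bp toy amplicon (NOT a real HBV sequence):
# one AfaI site and one HinfI site embedded in poly-A.
toy = "A" * 10 + "GTAC" + "A" * 10 + "GAGTC" + "A" * 11

print(digest(toy, ["AfaI"]))           # [12, 28]
print(digest(toy, ["AfaI", "HinfI"]))  # [12, 13, 15]
```

Running the same function on reference S gene sequences of each genotype would give the expected banding pattern against which gel fragments could be compared; fragments of only a few base pairs would not be resolved on a 4% agarose gel.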

Amplification of HBV S gene
One hundred geographically representative samples were selected at random for sequencing. The pre-S1 and pre-S2 regions of the HBV genome were amplified by hemi-nested PCR (BioER XP Cycler, China) using two sense primers (HBVS1, HBVS3) and one antisense primer (HBVS2) covering the whole region of the gene, with an expected amplicon size of 658 bp (Naito et al., 2001).

Purification of PCR product and sequencing
PCR products from the second-round PCR were purified with the PCR Purification Mini Kit (WizPrep™ Gel) according to the manufacturer's protocol. The purified PCR products were then subjected to direct sequencing on an ABI PRISM 3130 automated genetic analyzer.

Evaluation of sequence similarity
All curated sequences were aligned to determine their similarity, and consensus sequences were derived. Of the 100 sequences, 28 representing the entire genetic variation of the strains in different areas of KP were assessed for sequence similarity using CLC Main Workbench. The whole S gene sequences were aligned with the NCBI BLASTn tool to validate the genetic resemblance of our selected HBV sequences (query) to previously reported (subject) sequences.

Amino acids/polypeptides and peptide BLAST
The complete nucleotide sequences were translated into amino acid sequences with the online tool EMBOSS Transeq, and the sequences were saved in FASTA format. All peptide sequences obtained were searched with NCBI BLASTp, which showed matches with several reported and selected sequences of hepatitis B virus.
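The translation step can be sketched in a few lines. This is a minimal illustration and not the EMBOSS Transeq implementation; it assumes the standard genetic code and a single forward reading frame, and the function name `translate` is hypothetical. Codons containing ambiguous bases are written as X and stop codons as *.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code; amino acids are listed in TCAG codon order
# (TTT, TTC, TTA, TTG, TCT, ...), with "*" marking stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt, frame=1):
    """Translate a nucleotide string in forward frame 1, 2 or 3."""
    seq = nt.upper().replace("U", "T")
    start = frame - 1
    # Step through complete codons only; a trailing partial codon is dropped.
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


print(translate("ATGGAGAACATC"))  # MENI
```

The example input encodes the four residues Met-Glu-Asn-Ile. For real data, each sequence would be read from the FASTA file and translated in the frame that matches the S gene start.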

Phylogenetic analysis
HBV S gene sequences reported from different parts of the world were retrieved from online repositories and used to build a phylogenetic tree in MEGA 6 (Saitou and Nei, 1987).
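Distance-based tree building starts from a matrix of pairwise distances between aligned sequences. The sketch below computes the simplest such measure, the p-distance (proportion of differing sites); it is an illustration only. The distance model actually used in MEGA 6 is not stated in the text, so the choice of p-distance is an assumption, and the function names and toy sequences are hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(seqs):
    """Return {(name_i, name_j): p-distance} for every pair of sequences."""
    return {(i, j): p_distance(seqs[i], seqs[j]) for i in seqs for j in seqs}


# Hypothetical aligned toy sequences (NOT real HBV isolates).
toy = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTTC",
    "iso3": "AC-TACGAAC",
}
d = distance_matrix(toy)
print(d[("iso1", "iso2")])  # 0.1
```

Multiplying a p-distance by 100 gives percent divergence. The 8% and 4% genotype and subgenotype thresholds quoted in the introduction are defined on the whole genome, so they should not be applied directly to distances computed from the S gene alone.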

Statistical analysis
Statistical analysis of the data was done by using SPSS 20.0.

Results
Overall, 2850 samples tested positive for HBsAg by ICT, and HBV DNA was detected by qualitative PCR in all 3000 patients with active HBV infection. Of the 3000 patients, 56.3% were male and 43.7% were female, aged 10 to 85 years. All positive samples were further subjected to type-specific PCR and RFLP to identify the HBV genotypes circulating in KP. Three HBV genotypes were confirmed in our study: the most prevalent was genotype D (68.83%), followed by genotype A (22.63%) and genotype C (8.53%). To confirm the genotypes, 100 geographically representative samples were selected at random for S gene amplification and sequencing. The HBV strains sequenced in this study were analyzed together with reference sequences reported from 12 countries, which were retrieved from online DNA repositories. Only 28 geographically representative and genetically distinct sequences were considered for phylogenetic analysis. The phylogenetic data indicated that isolates KP2 and KP3 were closely related to the previously reported Pakistani isolates NCVI-JN1321132.1 and NCVI-JN132120.1, while KP4, KP5, KP6 and KP25 grouped with another previously reported Pakistani isolate, EF584653.1, also with a high bootstrap value. Isolates KP26 and KP28 clustered with HBV isolates from Hungary, Canada, Argentina and Syria in a separate clade (Fig. 1). In addition, isolate KP27 did not cluster with any Pakistani or foreign HBV isolate, suggesting a rare and distinct origin of the strain. As a whole, the phylogenetic data indicate a great deal of diversity among the HBV isolates sequenced in this study, with some close to previously reported local isolates and others genetically distinct.
All 28 S gene nucleotide sequences were translated into their respective polypeptides and searched with BLAST. Several proteins showed matches with HBV peptides reported from foreign countries. All S protein sequences were aligned and compared with a local Pakistani reference HBV isolate, NCVI-JN132120.1. The amino acid alignment indicated that some of the strains were closely related to the reference sequence, while others (KP7, KP26, KP27 and KP28) showed greater and wide-ranging diversity with respect to the reference. An insertion mutation 'I' at position 3 was not found in any of the KP isolates, although a number of other mutations were seen in the isolates.

Discussion
HBV is one of the major health threats across the globe, especially in emerging countries, including Pakistan. Because different HBV genotypes have different impacts on the progression and development of liver disease, transmission routes, treatment response and clinical outcomes, including cirrhosis and hepatocellular carcinoma (HCC), accurate information on specific HBV genotypes is needed to tackle these problems (Kao, 2002). The distribution of HBV genotypes varies with region and geographical conditions, so their epidemiological features need to be explored frequently in order to monitor disease control strategies and to understand the transmission and dissemination patterns of the virus. So far HBV has been categorized into 10 genotypes (A-J) that are distributed worldwide (Cao et al., 2009), and identification of the specific genotype is important to explain the route and pathogenicity of the strain. Several techniques are currently used to detect HBV genotypes, including direct sequencing, PCR with type-specific primers, RFLP, enzyme-linked immunoassay and line probe assay. Among them, sequencing is still considered the reference method, but it is costly compared with the other methods (Naito et al., 2001; Grandjacques et al., 2000; Usada et al., 1999). At present, PCR with genotype-specific primers and RFLP are the most widely used methods for detecting HBV genotypes, as both are simple, inexpensive and highly sensitive.
In this study we successfully genotyped all HBV-positive patient samples by genotype-specific PCR and RFLP, with genotype D (68.83%) being the most predominant, followed by HBV genotypes A and C. Baig et al. (2008) also reported genotype D as the most frequent type (64%), followed by A (23%). Very similar results were presented by Hanif et al. (2013), who reported genotype D as the leading genotype (58%), followed by a combination of genotypes A and D (31%), with genotype A alone recorded in only 10%. In the present study the least common circulating genotype was genotype C (8.53%), in contrast with the study by Awan et al. (2010), which claimed genotype C to be the most prevalent HBV genotype in our country. In Southeast Asia the most prevalent genotypes are B and C, which may be because the majority of studies were reported from China and Japan, where genotypes B and C are the most common (Chan et al., 2003), whereas a mixture of three genotypes (A, C and D) has been reported from India (Kumar et al., 2005). HBV genotype C generally occurs in China and India, whereas genotype D prevails in Iran, Afghanistan and India, which share borders with Pakistan (Norder et al., 2004; Khan et al., 2008). In our study three HBV genotypes (A, C and D) were confirmed, while the other five genotypes screened (B, E, F, G and H) were not detected in any sample, suggesting the absence of these strains in this region of the subcontinent.
Sequence characterization plays a crucial role in classifying the genome and in distinguishing HBV strains from one another, mainly on the basis of nucleotide substitutions, deletions or insertions. Such information is important for investigating viral ancestry and distribution patterns and is also beneficial for better control of the disease. This is the first in-depth study to report a genotypic and evolutionary analysis of the most frequently occurring HBV strains in KP based on the S gene region of the HBV genome, and to compare them with foreign isolates reported from different parts of the world. In this study the most prevalent genotype confirmed was D, followed by genotypes A and C. Numerous earlier studies have described genotype D as the most prevalent genotype. A quite similar distribution of genotypes D (62.5%) and A (8.9%) was found by Alam et al. (2007b). In contrast, Awan et al. (2010) claimed that genotype C is the most abundant HBV genotype in KP (38.8%). Phylogenetic analysis showed that some of our isolates clustered close to previously reported Pakistani HBV isolates, indicating the presence of similar strains in other provinces of Pakistan (Fig. 1). Two isolates, however, clustered closely with HBV isolates from Hungary, Canada, Syria and Argentina in a separate clade, showing close similarity with them. Another isolate, KP27, did not cluster with any Pakistani or foreign HBV isolate, suggesting a distinct and unique origin of the strain. As a whole, the phylogenetic analysis points towards great diversity among the HBV isolates sequenced in this study, with some close to previously reported local isolates and others genetically distinct.
The only limitation of the current study is that it did not examine the role of the genotypes in the progression of liver disease, so large-scale studies are required to obtain further information.

Conclusion
We conclude that genotype D is the most prevalent HBV genotype circulating in this region, followed by genotypes A and C, and that phylogenetic analysis based on S gene sequences revealed close homology of some of our sequences to local as well as foreign isolates. Because HBV genotype D shows a poor response to interferon antiviral therapy and genotype C causes more severe liver disease than other genotypes, practitioners should make genotyping of samples mandatory before starting treatment.

Declaration of competing interest
The authors declare no conflict of interest.

Funding source
There is no funding source.