Molecular characterization of Hepatitis C virus 3a in Peshawar

Background The purpose of this study was to explore the molecular epidemiology of HCV genotype 3a in Peshawar based on sequencing and phylogenetic analysis of the Core region of the HCV genome. Methods Patients chronically infected with hepatitis C virus (HCV) and enrolled under the Prime Minister Hepatitis C control program at three tertiary care units of Peshawar [Khyber Teaching Hospital Peshawar, Lady Reading Hospital Peshawar, Hayat Abad Medical Complex Peshawar] were included in this cross-sectional observational study. Qualitative detection of HCV and HCV genotyping were carried out by a modified reverse transcription-polymerase chain reaction (RT-PCR) and type-specific genotyping assay. The Core gene of HCV genotype 3a was amplified, cloned and sequenced. The sequences obtained were used for phylogenetic analysis using MEGA 6 software. Results Of 510 patients investigated, 422 (82.75 %) were PCR positive, and 192 of these (45.5 %) were identified as having HCV genotype 3a infection. HCV Core gene sequencing was carried out on randomly selected samples for the characterization of HCV 3a. Phylogenetic comparison of the partial HCV 3a Core gene sequences obtained with reference sequences from different countries showed that our sequences clustered with some local and regional sequences with high bootstrap values. Conclusion HCV 3a is highly prevalent in Peshawar, Pakistan, and phylogenetic analysis of Core gene sequences indicates that different lineages of HCV 3a circulate in Peshawar, which may have consequences for disease management strategies and place further economic pressure on the impoverished population because of possible antiviral resistance.


Background
Hepatitis C virus (HCV), first identified in 1989, is the primary cause of chronic liver diseases including cirrhosis (60-85 %) and hepatocellular carcinoma [1], with an annual mortality rate of 3.5-5.5 million due to complications of end stage liver diseases [2]. An estimated 185 million people (3 % of the world's population) are infected worldwide, with relatively high prevalence rates in developing countries [3,4]. In Pakistan, approximately 10-17 million people are chronically infected with HCV [5,6], with an overall prevalence rate of 5 % [7]. The frequency of HCV infection differs among the four provinces, with an estimated 1.1 million people infected in KPK [8,9].
HCV is an enveloped virus in the family Flaviviridae with a positive-sense ssRNA genome [10] consisting of an open reading frame (ORF) of approximately 9.6 kb in length [11]. The HCV genome shows substantial nucleotide sequence variability in both the structural and nonstructural coding regions, with different isolates of HCV showing as much as 30 % nucleotide sequence divergence over the entire genome, which is sufficient to alter the antigenic and biological characteristics of the seven major genotypes of HCV [12]. These genotypes have distinct geographical distributions. Although HCV genotypes 1, 2 and 3 appear to have a worldwide distribution, their relative prevalence varies from one geographic area to another [13]. Identification of the HCV genotype is extremely important because genotype is relevant to the epidemiology and clinical management of chronic HCV infection and is the strongest predictive parameter for sustained virological response [14].
The primary means to identify and classify new genotypes is either by phylogenetic analysis of sequences or by measures of pairwise sequence similarity [15]. Although it is generally true that longer sequences are more informative for classification, it is usually possible to identify genotypes by sequence comparison of relatively short subgenomic regions of HCV [16]. While the 5′ non-coding region (5′ NCR) is too conserved for HCV subtype identification, sequencing and phylogenetic analysis of nucleotide sequences amplified from the region of the genome encoding the Core protein are frequently used for classification of HCV into genotypes and subtypes [17][18][19]. The HCV Core gene has sufficient genetic diversity and can produce trees topologically identical to those obtained upon analysis of complete genome sequences [20].
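As an illustration of what a measure of pairwise sequence similarity involves, the sketch below computes the uncorrected p-distance (the proportion of differing sites) between aligned sequences. It is a minimal example, not the method used in this study: the function name `p_distance` and the toy sequences are hypothetical and were invented for illustration, and skipping sites with gaps or `N` is an assumed convention.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped, so the denominator counts only comparable positions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # site not comparable in this pair
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned fragments, invented for illustration (not real HCV data).
toy_alignment = {
    "isolate_A": "ATGAGCACGAATCCTAAACC",
    "isolate_B": "ATGAGCACAAATCCTAAACC",
    "isolate_C": "ATGAGCACGAATCC-AAGCC",
}

# Distance for every unordered pair of isolates.
names = sorted(toy_alignment)
distances = {
    (x, y): p_distance(toy_alignment[x], toy_alignment[y])
    for i, x in enumerate(names)
    for y in names[i + 1:]
}

for pair, dist in distances.items():
    print(pair, round(dist, 4))
```

In practice, distances of this kind are computed on much longer alignments and are often corrected with a substitution model; the uncorrected form is shown only because it is the simplest to follow.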
Molecular epidemiological studies previously conducted in Peshawar, Pakistan have indicated that the most prevalent HCV subtype is 3a, accounting for 70-90 % of HCV infections, based solely on type-specific PCR-based genotyping methods developed 15 to 20 years ago [21,22]. At present, there is little information about the genetic history and evolution of HCV genotype 3a in Peshawar. This needs to be investigated for several reasons, including increasing antiviral resistance of HCV 3a in Peshawar, KP province, and its evolutionary relationship with other regional or global isolates. Because the prevalence of HCV 3a has been reported to be very high among the general population of Peshawar, we undertook the current study to investigate it at the molecular level by type-specific assay, sequencing of the Core gene and phylogenetic analysis in order to obtain a clear epidemiological picture of the prevalent HCV 3a isolates.

Sampling
Patients chronically infected with hepatitis C virus and enrolled under the Prime Minister Hepatitis C control program at three tertiary care units [Khyber Teaching Hospital Peshawar, Lady Reading Hospital Peshawar, Hayat Abad Medical Complex Peshawar] willingly participated in this observational study. A non-probability convenience sampling technique was used to collect data. Written consent for participation and publication of the data was obtained from all eligible study participants, and the Institutional Ethics Committee (Khyber Medical University Ethics Board) approved the study. Blood samples were collected from the patients in sterile vacutainers, and sera were separated at the Institute of Biotechnology and Genetic Engineering Peshawar.

RNA Extraction
Total RNA was extracted from each sample using the Favor-Gen RNA isolation Kit (Favorgen Biotech corp, Taiwan, CAT No FAVNK 001) according to the manufacturer's instructions. The isolated RNA was either stored at -80 °C or immediately used for RT-PCR.

HCV Genotyping
Reverse Transcription PCR followed by Type-specific PCR for identification of HCV subtypes was carried out essentially according to [23] with modifications in the Universal and Type-specific primers according to the latest HCV sequencing data (Table 1).

Amplification of HCV 3a Core gene
A representative number of samples (58/192) that were positive for HCV 3a by the Type-specific assay described earlier were used for a second round of RNA extraction, RT-PCR and subsequent regular nested PCR for the amplification of the Core gene using gene-specific primers [24].

Cloning and sequencing of the Core gene
The Core gene products were eluted from agarose gel using the PureLink™ Quick Gel Extraction Kit (https://www.thermofisher.com/pk/en/home.html). The eluted products were cloned in the pGEM-T Easy vector (Promega) and transformed into the DH5-alpha strain of E. coli. Sequencing PCR of the Core gene was carried out using vector-specific sense and antisense primers (Big Dye Deoxy Terminator method). A bidirectional sequencing run was performed using an ABI PRISM 310 DNA sequencer [Applied Biosystems].

Blast analysis
In case of each isolate, three independent clones were sequenced. The respective sequences were aligned and

Phylogenetic analysis
HCV Core gene sequences from various geographical regions of the world were retrieved from online sequence repositories, and a phylogenetic tree was constructed using MEGA 6 [25].
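The bootstrap values reported for a phylogenetic tree come from rebuilding the tree many times on alignments whose columns have been resampled with replacement, then counting how often each cluster reappears. The sketch below shows only that resampling step. It is a minimal illustration and not MEGA 6's internal code: the function name `bootstrap_alignment`, the toy sequences, the seed and the choice of 100 replicates are all assumptions made for this example.

```python
import random


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Return one bootstrap replicate of an alignment.

    Columns (sites) are drawn with replacement, and the same draw is applied
    to every sequence so that each resampled column stays intact.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    length = lengths.pop()
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


# Toy aligned fragments, invented for illustration (not real HCV data).
toy_alignment = {
    "isolate_A": "ATGAGCACGAATCCTAAACC",
    "isolate_B": "ATGAGCACAAATCCTAAACC",
    "isolate_C": "ATGAGCACGAATCCTAAGCC",
}

rng = random.Random(2016)  # fixed seed so the example is reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(100)]

# Each replicate would then be passed to the tree-building method of choice;
# the bootstrap value of a cluster is the percentage of replicate trees
# in which that cluster is recovered.
print(replicates[0])
```

Resampling whole columns rather than individual characters is what preserves the relationship between sequences at each site, which is why the same column indices are applied to every isolate.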

Statistical analysis
SPSS version 20 was used for data analysis. The qualitative variables were described using percentage and the quantitative variables were described using the mean, median and standard deviation.
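The percentages reported in this paper can be re-derived from the underlying counts (510 patients investigated, 422 PCR positive, 192 typed as HCV 3a, and 58 of those 192 used for Core gene amplification). The short sketch below repeats that arithmetic; the helper name `percentage` and the variable names are hypothetical, and the analysis itself was done in SPSS, not with this code.

```python
def percentage(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts taken from the text of this paper.
patients_investigated = 510
pcr_positive = 422
genotype_3a = 192
core_amplified = 58

pct_pcr_positive = percentage(pcr_positive, patients_investigated)
pct_genotype_3a = percentage(genotype_3a, pcr_positive)
pct_core_amplified = percentage(core_amplified, genotype_3a)

print(f"PCR positive: {pct_pcr_positive:.2f} %")
print(f"Genotype 3a among PCR positive: {pct_genotype_3a:.2f} %")
print(f"3a samples used for Core amplification: {pct_core_amplified:.2f} %")
```

Note that the denominator changes between steps: the PCR-positive percentage is relative to all patients investigated, while the genotype 3a percentage is relative to PCR-positive samples only.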

Results
For identification of HCV genotype 3a infection among the actively infected HCV samples, RT PCR followed by Type specific nested PCR based genotyping assay was carried out.

Discussion
Accumulation of nucleotide substitutions in the HCV genome results in diversification and evolution into seven major genotypes and a series of subtypes [26]. The classification of HCV by viral genotype is important not only for selecting an appropriate treatment regimen and assessing global viral evolution; genotype epidemiological data can also reveal transmission activity and the migration of infected individuals from endemic areas [27][28][29]. These viral types and subtypes show differing distributions in different geographic regions, which has provided investigators with an epidemiologic marker that can be used to trace the source of HCV infection in a given population [30]. Sequencing and phylogenetic analysis of HCV Core gene nucleotide sequences have earlier been used for identification and classification of HCV isolates into genotypes and subtypes [19]. HCV genotype 3a is the most abundant in Pakistan, and it has earlier been documented that HCV genotype 3a is relatively more responsive to therapy than genotype 1 or 4 [28,29,31]. However, in Pakistan resistance is being observed in the relatively responsive and easy to treat HCV genotype 3a [32]. The present study is the first report describing the genotypic and evolutionary analysis of HCV 3a isolated from chronic HCV infected patients of Peshawar, based on a modified genotyping assay and subsequent sequencing and phylogenetic analysis. Investigation of the different circulating genotypes and their evolution is not only crucial for epidemiological and clinical analysis but might also be helpful for the improvement of diagnostic tests and treatment regimens [33]. HCV infection is highly prevalent in Pakistan, with an overall prevalence rate of 5 % [34]. The estimated HCV prevalence in KP province is 1.1 % [8].
In this study, out of a total of 510 seroactive patients investigated, 82.75 % were actively infected with HCV, with a high proportion of HCV genotype 3a (45.50 %). This is in agreement with most recent studies in Peshawar and Pakistan reporting HCV 3a to be the most prevalent genotype [35,36]; however, the percentage is lower than in previous reports claiming a much higher prevalence of HCV genotype 3a (60-74 %) [22,37]. One possible explanation for this changing HCV genotype 3a landscape could be a change in epidemiological pattern over time as a result of human migration [38], or the inadequate sensitivity of old genotyping assays to correctly type the circulating viral strains due to the substantial genetic heterogeneity inherent to RNA viruses [26]. Moreover, the emerging resistance to interferon therapy experienced in the case of HCV genotype 3a [32] might possibly be due to less prevalent genotypes other than HCV 3a, or to variants of HCV 3a that have evolved over time, which can cause a substantial economic and health burden on the infected population. Internationally accepted guidelines for the treatment of hepatitis C are rarely followed in KPK, and people undergo successive therapies once the initial response to antiviral drugs is negative [32]. Due to lack of awareness, this practice is not only causing economic and health-related losses but is also contributing towards the evolution of more resistant HCV 3a types.
Fig. 1 Phylogenetic tree of HCV 3a Core gene sequences constructed by the Maximum Likelihood algorithm (MEGA 6 Software) with Bootstrap values shown on the branches. The tree shows the phylogenetic relationship of twelve newly reported sequences, marked in red, with 36 Core gene sequences from different geographical regions. The isolate, country and accession number of each sequence are shown in the figure.
Molecular evolutionary analysis of the obtained viral genome sequences revealed that some HCV 3a isolates of Peshawar clustered closer to local isolates (Fig. 1), indicating the previous existence of similar types in other parts of Pakistan, which may have spread to Peshawar, KPK province via various transmission routes. Some of the HCV 3a isolates grouped with European, South Asian, Iranian and Chinese HCV 3a isolates (Fig. 1). Peshawar city is located in the northwestern region of Pakistan. It shares an international border with Afghanistan, which has been home to conflicts for the past 40 years, resulting in migration of various ethnic groups into and out of Afghanistan. These migrations have changed the epidemiological patterns of various pathogens over time, including HCV. Phylogenetic analysis indicates that foreign presence in Afghanistan and migration of Afghans to Peshawar might have contributed towards the spread of isolates which are genetically closer to European, Indian, Chinese and Iranian isolates, as HCV genotype 3a is also highly prevalent in neighboring countries including China [38], India [39] and Iran [40].
This study has some limitations. Due to convenience sampling, selection bias might have occurred. Moreover, we have characterized HCV 3a entirely on the basis of partial Core gene sequences. To elucidate the epidemiology of emerging HCV 3a in Peshawar and to further improve the accuracy of diagnostic assays and treatment regimens, there is a need to analyze complete coding sequences of more diverse regions of the HCV genome on a comparatively large scale.

Conclusion
This study concludes that HCV 3a is highly prevalent in Peshawar, Pakistan, and that phylogenetic analysis of Core gene sequences indicates that different lineages of HCV 3a circulate in Peshawar, which may have consequences for disease management strategies and place further economic pressure on the impoverished population because of possible antiviral resistance.

Availability of data and materials
The datasets supporting the conclusions of this article are available in the GenBank [National Center for Biotechnology

Competing interests
None of the authors have any competing interests in the manuscript.

Authors' contributions
IA and JA designed the study. AG, IA, FZ and NZ carried out the experimental work. JA, IA and IAK helped with data analysis, writing the MS and reviewing it thoroughly. All authors read and approved the final MS.

Authors' information
IA is a Fulbrighter working at CIIT Islamabad as Associate Professor and Leader of the Virology group. JA is a professor and director of the Institute of Basic Medical Sciences, Khyber Medical University Peshawar. His major interest is microbiology. AG is a PhD scholar while FZ and NZ are graduate students who have worked in IA's lab. IAK is a professor at the University of Agriculture Peshawar and his major interest is entomology and microbiology.