MicroRNA Profile of HCV Spontaneous Clarified Individuals, Denotes Previous HCV Infection

Factors involved in the spontaneous cleareance of a hepatitis C (HCV) infection are related to both HCV and the interaction with the host immune system, but little is known about the consequences after a spontaneous resolution. The main HCV extrahepatic reservoir is the peripheral blood mononuclear cells (PBMCs), and their transcriptional profile provides us information of innate and adaptive immune responses against an HCV infection. MicroRNAs regulate the innate and adaptive immune responses, and they are actively involved in the HCV cycle. High Throughput sequencing was used to analyze the miRNA profiles from PBMCs of HCV chronic naïve patients (CHC), individuals that spontaneously clarified HCV (SC), and healthy controls (HC). We did not find any differentially expressed miRNAs between SC and CHC. However, both groups showed similar expression differences (21 miRNAs) with respect to HC. This miRNA signature correctly classifies HCV-exposed (CHC and SC) vs. HC, with the has-miR-21-3p showing the best performance. The potentially targeted molecular pathways by these 21 miRNAs mainly belong to fatty acids pathways, although hippo signaling, extracellular matrix (ECM) interaction, proteoglycans-related, and steroid biosynthesis pathways were also altered. These miRNAs target host genes involved in an HCV infection. Thus, an HCV infection promotes molecular alterations in PBMCs that can be detected after an HCV spontaneous resolution, and the 21-miRNA signature is able to identify HCV-exposed patients (either CHC or SC).


Experimental Section
Samples were recruited from Hospital Universitario Virgen de Valme (Seville), Hospital Universitario Marqués de Valdecilla (Santander), and Hospital Universitario Infanta Leonor (Madrid) from 2014 to 2017. All samples were processed at the National Center for Microbiology (Majadahonda). The study protocol conformed to the ethical guidelines of the 1975 Declaration of Helsinki as reflected in a priori approval by the Ethics Committee of Institute of Health Carlos III (CEI PI 11_2015-V4), and written informed consent was obtained from all patients involved.

Patient Groups
Ninety-six Caucasian patients were recruited and classified in three groups: a) HCV spontaneous clarifiers (SC), individuals who spontaneously resolved acute HCV infection (serum positive antibody and negative PCR); b) chronic hepatitis C (CHC) treatment-naïve patients (detectable HCV RNA by PCR); and c) healthy controls (HC) that were never infected with HCV (antibody and PCR negative). The exclusion criteria included the following: pregnancy; individuals below 18 years old; previously HCV treated; advance liver fibrosis; clinical evidence of hepatic decompensation; active drug or alcohol addiction; alcohol-induced liver injury; HBV active infection; anti-HIV antibody; opportunistic infections; and other concomitant diseases such as diabetes, nephropathies, autoimmune disease, hemochromatosis, cryoglobulinemia, primary biliary cirrhosis, Wilson´s disease, α1-antitrypsin deficiency, and neoplasia.

Clinical Records
HCV-related clinical and epidemiological data were obtained from medical records as the year of infection, time since spontaneous clarification, route of transmission, fibrosis stage, HCV viral load, HCV genotype, and genotype of rs12979860 polymorphism at interferon lambda 4 (gene/pseudogene) (IFNL4). The liver stiffness measurement (LSM) was assessed by transient elastometry (FibroScan®, Echosens, Paris, France) and expressed in kilopascals (kPa) (1). Subjects were stratified according to cut-offs of LSM: <7.1 kPa (F0-F1 absence or mild fibrosis) and 7.1-9.4 kPa (F2 significant fibrosis). The HCV-RNA viral load was measured by quantitative polymerase chain reaction (qPCR) (Cobas Amplicor HCV Monitor Test, Branchburg, NJ, USA; and COBAS AmpliPrep/COBAS TaqMan HCV test). The results were reported in International Units per milliliter (IU/mL), with a lower limit of detection of 10 IU/mL. In the case of patients with a previous history of intravenous drug use (IDU), the year of infection was estimated since the first year they shared needles and other injection paraphernalia (2).
Clinical characteristics of metabolic status were also recorded; the body mass index (BMI) was calculated as the weight in kilograms divided by the square of the height in meters, and the following biochemical parameters were also recorded: glucose level, total cholesterol (TC), low density lipoprotein (LDL), high density lipoprotein (HDL), and triglycerides (TG). Homocysteine has been evaluated as a measure of lipid peroxidation, and values higher than 15 µmol/L were considered high. Different lipid ratios were calculated to assess lipid metabolism deregulation: LDL/HDL, normal values below 2.0; AI, atherogenic index calculated as TC/HDL, with cut-off values of low risk <5% for men and <4.5% for women, moderate risk 5-9% for men and 4.5-7% for women; AIP, atherogenic index for plasma considering AIP > 0.21 as high risk; and LCI, lipoprotein combine index defined as the ratio of TC * TG * LDL to HDL-C.

Total RNA Isolation and Quality Control
Peripheral venous blood samples were collected in EDTA tubes, and PBMCs were isolated within the first 4 hours after extraction in order to minimize miRNA expression profile. Total RNA including miRNAs were isolated with the miRNeasy Mini kit (Qiagen). RNA was eluted in 30 ul of nuclease-free water, and the RNA concentration was measured by Nanodrop. Quality and integrity were evaluated by the Bioanalyzer 2100 with Agilent RNA 6000 Nano kit (Agilent, catalog no. 5067-1511). Only the samples with a high quality of RNA (RIN > 7.5) were sequenced. Fragments with insert sizes of 18 to 36 bp were cut from the gel, and DNA was purified, quantified, and pooled for multiplexed sequencing. Final libraries were analyzed using Agilent DNA 1000 chip to estimate the quantity and to check the size distribution and were then quantified by qPCR using the KAPA Library Quantification Kit (ref. KK4835, KapaBiosystems) prior to amplification with Illumina's cBot. Sequencing was performed in the Illumina HiSeq2500, Single Read, 50nts (1x50). Four pools of 24 barcoded samples were prepared, and each pool was sequenced on one line of the same run to limit the batch effect.

MiRNA Validation by Quantitative RT-PCR
Five hundred ng of total RNA was reverse transcribed into complementary DNA with the qScript microRNA cDNA synthesis kit, following the manufacturer instructions. Briefly, miRNAs were polyadenylated, and the poly(A) tailed miRNAs were converted into first-stranded cDNA with an oligo-dT adapter primer. Individual miRNAs were quantified in a SYBR green quantitative PCR reaction by PerfeCta SYBR Green SuperMix (2x) (Quantabio) with a forward primer (sequence of de selected mature miRNA) and the PerfeCta universal PCR primer (specific to the unique sequence of the oligo-dT adapter primer). Sequences of the miRNA nucleotides were extracted from the miRBase Reslease 21 (www.mirbase.org) (1). The PCR efficiency of each miRNA amplicon was evaluated with a standard curve using a 10-fold dilution series of total RNA from a donor buffy. Only miRNA amplicons with an efficiency higher than 1.9 were accepted.
Real time reactions were performed on a Roche LightCycler 480 with a 96-well reaction plate. There is no universal endogenous control for every tissue, and the suitable endogenous control has to be validated for each sample set (2). Endogenous control selection was performed according to the stability of their gene expression in the three groups of study with geNorm(3), BestKeeper (4), and Normfinder (5). Five small nuclear RNA (snRNA) commonly used as endogenous control from miRNA expression studies were tested for suitability: the RNA U1 (RNU1), U2 (RNU2), U6 (RNU6), U44 (RNU44), and the small nucleolar RNA C/D box 48 (SNORD48).
Raw data was exported to Factor qPCR to normalize and remove multiplicative between-run variations of the PCR experiment with multiple plates. (6). Data were analyzed using the ΔΔCT method.

Data Processing Pipeline: Bioinformatics Analysis
We set up a specific bioinformatic pipeline for our data, which includes the following steps.

Known miRNA Identification from High-Throughput Sequencing Data
The raw data were initially filtered out for reads with ambiguous base calls, which did not meet the Illumina chastity filter based on quality measures, and the reads were sorted on sample type based on matches to the multiplex tags. Quality control of the remaining sequences was performed by using FastQC (v0.11.3)(3). Adapter sequences as well as low-quality base calls (q < 20) were trimmed with cutadapt (v. 1.13). Adapter-trimmed reads were processed with miRDeep2 (v. 0.0.7) (4), which identifies known and novel miRNAs from the dataset. This software maps processed reads to the human reference genome (GRCh38) (mapper.pl module) based on Bowtie1. Only the alignments with 0 mismatches in the seed region and those that do not map to more than five different loci in the genome were retained. Quantification was performed with the quantifier.pl module, which maps the reads to miRNA precursors (obtained from miRBase v20 which contains 1917 precursors and 2654 mature sequences) and determines the expression of the corresponding miRNAs. Only miRNAs with a minimum of total 100 counts in all samples were retained.

De Novo miRNA Identification
The miRDeep2.pl module was used to discover potential novel miRNAs. Briefly, reads were aligned to the human reference genome and candidate pre-miRNA sequences were extracted and score-assigned based on the ability of the precursor to fold to a pre-miRNA-like secondary hairpin structure (4). Mature miRNAs from related species such as Gorilla gorilla, Pongo pygmaeus, Pan troglodytes, and Pan paniscus were used to predict with a higher confidence the new miRNAs. A miRDeep2 score was assigned to each predicted miRNAs, which represents the probability that the sequence is a true miRNA precursor based on the theory of miRNA processing by dicer as well as the actual data and the alignment pattern of the reads. Only those miRNAs that fulfill the following criteria were selected: (1) A miRDeep2 score cut-off of >4; (2) an estimated probability that the miRNA candidate is a true positive > 0; (3) the total read counts of the predicted mature are >100; and (4) a significant randfold p-value of the excised potential miRNA hairpin (5).

Multivariate Data Analysis
We used dimensional reduction methods as the Principal Component Analysis (PCA) to visualize whether the experimental samples were clustered according to the groups of patients and to identify unwanted sources of noise. We used the NOISeq PCA function, which allows us to plot the loading values, that is, the projection of the genes on the new principal components or the scores, which are the projections of the samples (observations) on the space created by the new components.

Statistical Analysis for Significant Differentially Expressed miRNAs
Currently, there are not specific software packages designed to normalize miRNA sequencing data; for this reason, three normalization methods commonly used for RNA sequencing analysis were tested.
Normalization methods: (1) reads per kilobase million (RPKM) by DESeq (6) (v 1.28.0); (2) trimmed mean of M-values normalization method (TMM) by edgeR (v 3.18.1) (7); and (3) upper quantile normalization (UPERQ) by NOISeq (v 2.14.1) (8). Each of these methods is described briefly: (1) DESeq uses a negative binomial distribution of the count data to perform a differential expression analysis and incorporates data-driven prior distributions to estimate the dispersion and fold changes (6). (2) TMM normalization takes into account the composition of the RNA population being sampled, which is neglected in total count scaling, and gives us the proportion of counts for a specific target across all samples. edgeR fit a negative binomial generalized log-linear model to read counts for each gene. Both methods are parametric, we also included a nonparametric method in the comparison: (3) Quantile normalization is nonscaling and assumes that the overall distribution of signal intensity does not change. NOISeq for biological replicates (NOISeqBIO) corrects data by length and implements an empirical Bayes approach that improves the handling of biological variability specific to each gene.
Significantly differentially expressed (SDE) miRNAs were calculated by a fold change (FC) of >2 and a statistically significant t test p-value < 0.05 adjusted by the false discovery rate (FDR) using the Benjamin-Hochberg correction. The testing for differentially expressed genes was performed as follow: (1) DESeq fit a generalized linear model for each miRNA with the fold change estimate shrunken by empirical Bayes; the function nbionmTest was used to estimate differences between groups. (2) edgeR used a likelihood ratio test for determining the differential expression among groups of patients, and it is performed with the glmFit() and glmLRT() functions. (3) NOISeq is a nonparametric method that improves the control of the high FDR in experiments with biological replicates, inspired by the work of Efron et al. (9). Low-count filtering was performed by the Wilcoxon test method, which is recommended for our number of samples. The probability of differential expression is equivalent to 1-FDR, where FDR is considered an adjusted p-value (0.05).

Venn Diagram Analysis and Clustering
A Venn Diagram was performed to determine the overlapping SDE miRNAs, which were defined as common SDE miRNAs in the three method of normalization and statistical analysis. The overlapping list of each group of comparison was determined as the SDE miRNAs for subsequent analysis.
An unsupervised hierarchical clustering of differentially expressed genes between patient groups was performed using hclust algorithm (heatmap3, R package) with the average linkage rule (UPGMA method).

miRNA-Based Target Prediction and Pathway Enrichment Analysis of the Target Genes
The web-based computational tool DIANA-miRPath v3.0 (10) was used for the in silico target identification of the SDE miRNAs. The analysis was based on experimentally supported targets predicted in silico and annotated in DIANA-TarBase v.8.0 (11). This tool also performs a pathway union analysis of those miRNAs targets, which is performed for Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (12). Enrichment p-values (Fischer's exact test with hypergeometric distribution) were corrected for false discovery rate (FDR) and were considered significant when adjusted for p ≤ 0.05.
Next, the SDE miRNAs were subjected to a target-based pathway enrichment analysis to identify key nodes and edges in each set of data, and miRNA-mRNA regulatory networks were identified. We used the web-tool miRNet (13), which integrates data from eleven different miRNA databases and allow us to analyze miRNA-target interaction networks and functions.

HCV-Related Genes
We have analyzed the implication of the 21 SDE miRNAs in targeting host genes related to HCV infection, and nine out of 21 SDE miRNAs were identified. The most relevant miRNA in this analysis was has-miR-124-3p, which was upregulated in HCV-exposed individuals, and it targets 18 HCV infection-related genes. Here, we review the most relevant targets of this miRNA and their biological functions.
HCV interacts with components of the IFN alpha/beta pathway to inhibit the production of IFN and the establishment of an antiviral state (14). The hsa-miR-124-3p can target seven members of this signaling pathway: the toll like receptor 3 (TLR3), the inhibitor of nuclear factor kappa B kinase subunit epsilon (IKBKE), the eukaryotic translation initiation factor 2 alpha kinase 2 (EIF2AK2), the interferon-alpha/beta receptor 2 (IFNAR2), the tyrosine kinase 2 (TYK2), the protein inhibitor of activated STAT 1 (PIAS1), and RELA proto-oncogene (RELA). TLR3 is restricted endosomes, and it recognizes dsRNA associated with a viral infection to induce the activation of NF-kappaB and IFN production (15). Upon TLR3 engagement with its ligand, IKBKE is activated (15). The IKBKE inhibits T cell immune response (16), which is essential for regulating antiviral signaling pathways. Several viruses interact directly with and inhibit IKBKE/IKK-epsilon to prevent IRFs activation. IKBKE kinases activate several transcription factors such as the nuclear factor-kappaB (NF-kB). RELA is a subunit of the nuclear NF-Kb, which is involved in multiple aspects of innate and adaptive immune functions (17). RELA is required during a key early phase after virus infection (18), where it is crucial for early IFN-beta expression and resistance to RNA virus replication. The coordinated action of NF-Kb together with other transcription factors leads to the induction of IFN and proinflammatory cytokines and to the establishment of the innate immune response (19). HCV has developed strategies to disrupt the induction of IFN and cytokine pathways through viral protein interaction, favoring viral propagation and presumably HCV chronic infection (19). Secreted IFNs bind and activate the IFNAR2 in an autocrine and paracrine manner, which leads to the induction of hundreds of interferon stimulated genes (ISGs). IFNAR2 gene has been identified as overexpressed in PBMCs of HCV chronic patients, and its expression significantly correlates with the effectiveness of interferon therapy independently of the viral load (20). TYK2 encodes a protein associated with the cytoplasmic domain of type I and type II cytokine receptors and IFNAR and promulgate cytokine and IFN signals by phosphorylating receptor subunits. As such, it may play a role in the anti-viral immunity. IFN activates the signal transducer and activator of transcription (STAT) pathway to regulate immune response. The protein inhibitor of activated STAT 1 (PIAS1) is a negative regulator of STAT1, being critical for the IFN-gamma-or beta-mediated innate immune response (21). PIAS1 has a dual function, acting at the virus-induced early singling stage and IFN stimulated amplifying stage. Thus, PIAS1 maintains proper amounts of type I IFNs and retains its magnitude when the antiviral response intensifies (22). EIF2AK2 is a serine/threonine protein kinase that play a key role in the innate immune response to viral infection as well as is involved in the regulation of signal transduction in different pathways as well as in the expression of cytokines and IFNs. It has an antiviral activity on HCV, among others.
On the other hand, through TLR3 activation, the PI3K/AKT signaling pathway can be induced to produce ISGs (23). HCV activates PI3K-Akt signaling to enhance entry, replication, and translation (24,25). Three main factors of this pathway targeted by has-miR-124 are the phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) and the AKT serine/threonine kinase 2 (AKT2) and 3 (AKT3). PIK3CA is the catalytic subunit of the PI3K, and it is involved in cell proliferation, apoptosis, and angiogenesis. Mutations within its gene have been identified in HCC in different populations (26). AKT2 and AKT3 are key effectors of the insulin signaling pathway and regulates cellular metabolism (27). HCV seems to activate AKT to apparently enhance the permissiveness of the cell for HCV infection (28). The assault by HCV on this pathway that regulates cellular growth and metabolism probably plays an important role in HCV pathogenesis.
HCV modulates the MAPK/ERK cascade for its own propagation (29). Eight genes within this pathway can be targeted by hsa-miR-124-3p: the growth factor receptor bound protein 2 (GRB2), SOS Ras/Rho guanine nucleotide exchange factor 1 (SOS1) and 2 (SOS2), A-Raf proto-oncogene, cytosolic serine/threonine kinase (ARAF), the GTPase neuroblastoma RAS proto-oncogene (NRAS) (30), the mitogen-activated protein kinase 11 (MAPK11), the protein phosphatase 2 scaffold subunit Abeta (PPP2R1B), and the cyclin dependent kinase inhibitor 1A (CDKN1A) also known as p21. This pathway can be active by receptor tyrosine kinases (RTK) such as the epidermal growth factor receptor (EGFR), which is a central controller of HCV viral entrance (31). The signal received by the receptor is integrated by the RAS/RAF/MEK/MAPK cascade reaction that activates the signal transduction pathway and activates transcription factors to regulate gene expression. GRB2 is the downstream effector of the receptor and, together with SOS1 and SOS2, regulates the action of RAS protein. Previous report has detected an interaction of the HCV NS5A protein with GRB2 to inhibit downstream mitogen signaling (32). In vitro assays of Huh7.5-infected cells (33) identified SOS1 as one internal driving factor leading to the extrahepatic manifestation of an HCV infection. SOS1 activate the MAPK signaling pathway through RAS activation. RAS proteins act as molecular switches and are involved in several signaling transduction pathways including MAPK and the previously mentioned PI3K. NRAS is one of the three RAS isoforms that control cell proliferation and growth. Once RAS is activated, it recruits and activates the protein kinases RAF, a family of enzymes (such as ARAF) that activates ERK signaling by phosphorylating MEK, which in turns activate MAPK. MAPK11 is also activated in the presence of an HCV core protein (34,35). In vitro studies in human hepatoma cells have previously reported that a low level of Ras-ERK signaling activity is required for HCV RNA replication but that a complete inhibition is inhibitory (36). The protein phosphatase 2 scaffold subunit Abeta (PPP2R1B) is a regulator of this pathway. Different viruses such as HCV interact with PP2R1B to specifically subvert key survival pathways for the host (37). Finally, CDKN1A is a regulator of the cell cycle and is frequently mutated in human cancers, such as HCC. In addition, the HCV core protein downregulates p21 in human hepatoma cells (38).
Therefore, the upregulation of hsa-miR-124 by an HCV infection will repress the RAS-MAPK signaling pathway at different levels to ultimately regulate HCV RNA replication.
An HCV infection requires multiple host signaling pathways to successfully infect the host cell. Our results suggest that the subverts of miRNAs are an additional strategy of HCV to maintain infection (39) as the upregulation of hsa-miR-124 in PBMCs by HCV downregulates IFN production at different stages.