Using next-generation sequencing of microRNAs to identify host and/or pathogen nucleic acid signatures in samples from children with biliary atresia – a pilot study

Biliary atresia (BA) is a progressive disease affecting infants resulting in inflammatory obliteration and fibrosis of the extra- and intra-hepatic biliary tree. BA may be grouped into type 1 isolated; type 2 syndromic, where other congenital malformations may be present; type 3 cystic BA, where there is cyst formation within an otherwise obliterated biliary tree; and cytomegalovirus-associated BA. The cause of BA is unclear, with immune dysregulation, inflammation and infection, particularly with cytomegalovirus (CMV), all implicated. In this study a total of 50/67 samples were tested for CMV DNA using quantitative real-time PCR. Ten liver tissue and 8 bile samples from 10 patients representing the range of BA types were also analysed by next-generation sequencing. CMV DNA was found in 8/50 (16 %) patients and a total of 265 differentially expressed microRNAs were identified. No statistically significant differences between the various types of BA were found. However, differences were identified in the expression patterns of 110 microRNAs in bile and liver tissue samples (P<0.05). A small number of bacterial and viral sequences were found, although their relevance to BA remains to be determined. No direct evidence of viral causes of BA was found, although clear evidence of microRNAs associated with hepatocyte and cholangiocyte differentiation, together with fibrosis and inflammation, was identified. These include miR-30 and the miR-23 cluster (liver and bile duct development) and miR-29, miR-483, miR-181, miR-199 and miR-200 (inflammation and fibrosis).


INTRODUCTION
Biliary atresia (BA) is a progressive disease affecting infants resulting in inflammatory obliteration and fibrosis of the extra- and intra-hepatic biliary tree. Worldwide incidence ranges from 1 : 5000 to 1 : 20 000 live births, with 1 in 17 000 in the UK, equating to about 50 infants annually [1]. Surgical management involves carrying out a Kasai portoenterostomy, whereby the damaged extrahepatic biliary tree is removed in an attempt to restore bile flow from the native liver via residual bile ducts at the porta hepatis. This may initially be successful and clear the jaundice in 50-60 % of infants, although by adolescence, most affected patients require a liver transplant [2]. BA may be grouped by clinical phenotype into four variants: type 1 isolated, or perinatal BA (75-80 %); type 2 syndromic, or embryonic BA (7-10 %) with other congenital malformations, e.g. biliary atresia splenic malformation (BASM); type 3 cystic BA (5-8 %) with cyst formation within an otherwise obliterated biliary tree; and cytomegalovirus (CMV)-associated BA [3].
Most studies investigating an infectious cause of BA suggest an association with hepatotropic viruses, such as reovirus, rotavirus and, in particular, CMV [4,5]. However, the relationship between CMV infection and BA is not conclusive; Fischler et al. [6] compared the impact of ongoing CMV infection at presentation of BA on the long-term outcome after the Kasai procedure. Twenty-eight patients with BA were investigated and no differences in survival with native liver or in survival after liver transplantation were found. Other viruses have also been implicated, including hepatitis B, hepatitis C, adenovirus, human papillomavirus (HPV), human herpes virus 6 (HHV6), Epstein-Barr virus (EBV) and BK virus [7][8][9]. These studies again are not conclusive and suggest that the presence of viruses may be a secondary phenomenon in BA. In addition to trying to identify an infectious cause of BA, a number of studies have implicated immune dysregulation as a pathogenic factor. There is also a significant amount of evidence that inflammation is a major factor [9][10][11][12][13][14][15][16].
In this study microRNA profiling using next-generation sequencing (NGS) was chosen as an unbiased approach to biomarker discovery [17,18]. MicroRNA profiling was specifically selected, since it has a number of advantages: (i) the cell machinery required to enable posttranscriptional targeting by virally encoded or endogenous miRNAs is conserved in all eukaryotic cells, including hepatocytes [19]; (ii) miRNAs are associated with a number of human pathological conditions and are highly stable; (iii) they are produced by pathogens, including viruses, and because of their stability they may remain detectable even after the genomic sequence signal of the pathogen has been lost. It is also thought that while the biogenesis and mode of action of viral miRNAs may be similar to those of eukaryotic organisms, their target strategy may be different from that of their hosts. In this way viral miRNAs may have evolved to regulate a small number of host genes essential to their survival. If no viral target sequences can be found, it may therefore still be possible to identify the virus by detecting the host response.
The aim of the study was to investigate whether an infection, immune dysregulation or other host genetic factors played a part in the development of BA in a paediatric cohort. Due to the lack of true non-BA controls, the isolated, idiopathic form was compared with samples representing BASM, choledochal malformations and CMV-associated BA.

Ethical statement
All samples used in the study were obtained from the Department's BioBank with ethical approval, together with approval from the parents or guardians.

Patients
In total, 230 samples from 67 children admitted with BA for a Kasai portoenterostomy procedure between 6 September 2010 and 11 July 2014 were collected (see Table 1). The median age was 66 days at sample collection (range 21-5763 days); 27/67 (40.3 %) were male. Each patient was given a unique study number.

CMV DNA analysis
Samples were extracted using a QIAamp One-For-All Nucleic Acid kit on the Qiagen BioRobot MDx according to the manufacturer's instructions (Qiagen, Manchester, UK). A total of 50/67 patient samples were tested for the presence of CMV DNA using the clinical laboratory diagnostic quantitative real-time PCR assay based on previously described methods [20, 21], modified with the inclusion of a feline herpesvirus extraction/inhibition control, added to the sample prior to extraction. Briefly, Qiagen NoRox QuantiTect Multiplex PCR Mix was used with 10 and 15 pmol of the CMV forward and reverse primers, together with 15 pmol of the FAM/BHQ1-labelled probe, together with 5 pmol of the forward and reverse primers and 1 pmol of the ROX/BHQ2-labelled feline herpesvirus probe in a 30 µl total reaction volume, which included 10 µl of the extracted sample. Reactions were run on a Rotor-Gene 6000 using the following cycling conditions: one cycle of 95 °C for 15 min, followed by 45 cycles of 95 °C for 15 s; 55 °C for 20 s (acquiring); and 65 °C for 20 s. Where there was sufficient volume, urine, whole blood or EDTA and bile samples were tested (see Table 2).
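The oligonucleotide amounts above are given per reaction, so it can help to express them as final concentrations. The following is a minimal sketch of that conversion only; it assumes that "10 and 15 pmol" refers to the CMV forward and reverse primers respectively, and the dictionary keys (including the abbreviation FHV for feline herpesvirus) are illustrative labels rather than names from the assay protocol.

```python
# Amounts per reaction as stated in the text; the 30 µl total volume
# already includes the 10 µl of extracted sample.
REACTION_VOLUME_UL = 30.0

# Assumption: "10 and 15 pmol" is read as forward = 10, reverse = 15.
# "FHV" (feline herpesvirus) is a label introduced here for brevity.
oligos_pmol = {
    "CMV forward primer": 10.0,
    "CMV reverse primer": 15.0,
    "CMV probe (FAM/BHQ1)": 15.0,
    "FHV forward primer": 5.0,
    "FHV reverse primer": 5.0,
    "FHV probe (ROX/BHQ2)": 1.0,
}


def final_concentration_nm(amount_pmol: float, volume_ul: float) -> float:
    """Convert pmol per reaction to nM (1 pmol/µl equals 1 µM, i.e. 1000 nM)."""
    return amount_pmol / volume_ul * 1000.0


concentrations_nm = {
    name: final_concentration_nm(amount, REACTION_VOLUME_UL)
    for name, amount in oligos_pmol.items()
}

for name, conc in concentrations_nm.items():
    print(f"{name}: {conc:.0f} nM")
```

Under these assumptions the primers and CMV probe fall in the range of a few hundred nanomolar, and the control probe is present at roughly a tenth of that.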

Sequencing
Sequencing libraries were prepared from liver tissue and bile samples, matched where possible, from 10 patients. These consisted of 10 liver tissue and 8 bile samples from 5 isolated BA, 2 BASM, 2 choledochal cysts and 1 CMV-associated BA patient (see Table 3). RNA was extracted using Qiagen miRNeasy kits according to the manufacturer's instructions; for liver tissues, 2×20 mg pieces were used, for bile 50 µl from babraham.ac.uk/projects/fastqc/); sequences were analysed using miRBase (http://www.mirbase.org/index.shtml) and miRNA counts registered and the edgeR package (v 2.14; https://bioconductor.org/packages/release/bioc/html/edgeR.html) used for differential expression analyses based on the methods described by Wang et al. [22]. In each sample miRNA levels were quantified and normalized to account for the different total number of reads. For the second analysis all sequences were first compared with the human genome

CMV DNA loads and CMV IgM results
CMV DNA quantification was carried out in bile and urine samples collected from 50/67 patients. Evidence of CMV infection was found in eight (16 %) patients, three of whom were diagnosed with CMV-associated BA. Of the other five patients, three were diagnosed with isolated BA and two with choledochal malformations (see Table 2). CMV DNA was detected in the urine samples collected from two of the three patients with CMV-associated BA (3 and 64) and also in the bile of patients 3 and 7; there was insufficient bile sample from patient 64 for testing.
CMV IgM-positive results were noted in four of the six patients tested and one other was indeterminate. Two of the patients with CMV IgM-positive and indeterminate results had isolated BA. Most patients had been referred from other hospitals and for 30 of 67 there were no CMV serology results available when requested. Further samples had not been tested as a baseline. However, CMV DNA was negative in the urine samples from 17 of these patients, therefore excluding active CMV infection at diagnosis. Of the rest, there were no CMV results for any samples available.
Table 3 shows details of the 18 bile and liver tissue samples from 10 patients (4 male and 6 female), together with their biliary atresia classification and initial number of reads. After adapter removal, the general pattern across all samples was of three main sequence peaks of 22 nt (corresponding to the expected miRNA length), 32 nt and one of 50 nt. However, 3 samples (bile 27, liver 3 and 19) showed very low numbers of reads (fewer than 1000 matches against known miRNAs) and were excluded from further analyses. This resulted in a total of 15 samples for analysis.
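The exclusion rule above, and the normalization for total read number mentioned in the Sequencing section, can be illustrated with a short sketch. This is not the pipeline used in the study: the sample names and counts are invented, and simple counts-per-million scaling is assumed here for clarity, whereas edgeR applies its own normalization.

```python
MIN_MIRNA_READS = 1000  # exclusion threshold stated in the text


def filter_samples(counts, min_reads=MIN_MIRNA_READS):
    """Keep samples whose total miRNA-matched read count reaches the threshold."""
    return {
        sample: mirnas
        for sample, mirnas in counts.items()
        if sum(mirnas.values()) >= min_reads
    }


def counts_per_million(sample_counts):
    """Scale raw counts so that every sample sums to one million reads."""
    total = sum(sample_counts.values())
    return {mirna: n * 1_000_000 / total for mirna, n in sample_counts.items()}


# Hypothetical toy data: sample identifiers and counts are invented.
raw_counts = {
    "liver_A": {"miR-122-5p": 6000, "miR-30a-5p": 3000, "miR-29a-3p": 1000},
    "bile_A": {"miR-122-5p": 500, "miR-486-5p": 4000, "miR-29a-3p": 500},
    "bile_B": {"miR-122-5p": 40, "miR-486-5p": 300, "miR-29a-3p": 10},
}

kept = filter_samples(raw_counts)  # bile_B has only 350 reads and is dropped
cpm = {sample: counts_per_million(c) for sample, c in kept.items()}

for sample, values in cpm.items():
    print(sample, values)
```

Scaling to a common library size makes expression values comparable between samples with very different sequencing depths, which is the purpose of the normalization step described in the methods.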

MicroRNA expression in all samples of the four biliary pathologies
A total of 265 differentially expressed miRNAs in bile and liver samples were found (Table S1 and Fig. S1, available in the online version of this article). In order to determine if there were any significant differences in microRNA expression patterns between the different forms of BA, principal component analysis was performed on the sequence data from bile and liver samples separately (see Figs 1 and 2).
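As an illustration of what principal component analysis does with such data, the sketch below extracts only the first principal component from a tiny matrix of log-scale expression values. It is a simplified stand-in, not the analysis behind Figs 1 and 2: the values are invented, the function names are hypothetical, and power iteration is used here simply to avoid external dependencies.

```python
import math


def center_columns(matrix):
    """Subtract each column (miRNA) mean so the data are centred at zero."""
    n_rows = len(matrix)
    means = [sum(column) / n_rows for column in zip(*matrix)]
    return [[value - m for value, m in zip(row, means)] for row in matrix]


def covariance_matrix(centered):
    """Sample covariance between columns of an already centred matrix."""
    n_rows = len(centered)
    columns = list(zip(*centered))
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (n_rows - 1) for cj in columns]
        for ci in columns
    ]


def leading_component(cov, iterations=200):
    """Power iteration: largest eigenvalue of cov and its unit eigenvector."""
    size = len(cov)
    vec = [float(i + 1) for i in range(size)]  # arbitrary non-zero start
    for _ in range(iterations):
        product = [sum(cov[i][j] * vec[j] for j in range(size)) for i in range(size)]
        norm = math.sqrt(sum(v * v for v in product))
        if norm == 0.0:
            break
        vec = [v / norm for v in product]
    cov_vec = [sum(cov[i][j] * vec[j] for j in range(size)) for i in range(size)]
    eigenvalue = sum(v * c for v, c in zip(vec, cov_vec))
    return eigenvalue, vec


# Hypothetical log2 expression values; rows are samples, columns are miRNAs.
sample_names = ["bile_1", "bile_2", "liver_1", "liver_2"]
log_expr = [
    [14.0, 6.0, 10.0],
    [13.5, 6.5, 10.2],
    [8.0, 15.0, 10.1],
    [8.5, 14.5, 9.9],
]

centered = center_columns(log_expr)
cov = covariance_matrix(centered)
eigenvalue, pc1 = leading_component(cov)

# Project each sample onto the first component and report variance explained.
scores = [sum(v * w for v, w in zip(row, pc1)) for row in centered]
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))

for name, score in zip(sample_names, scores):
    print(f"{name}: PC1 score {score:.2f}")
print(f"variance explained by PC1: {explained:.3f}")
```

In this toy example the two sample types fall on opposite sides of the first component. The sign of a principal component is arbitrary, so only the separation between groups is meaningful, not which group is positive.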
No statistically significant differences between the different forms of BA were found, that is, the type 1, idiopathic forms displayed similar microRNA expression patterns to the known (choledochal malformation, BASM and CMV-associated) forms. The PCAs do suggest a difference between the expression patterns of bile and liver tissues from the two BASM samples (5 and 6) and the other types of BA, although the sample numbers are too small to form a definite conclusion. There were, however, statistically significant differences overall in miRNA expression between bile and liver samples for all pathologies.
Preliminary analysis of both bile and liver samples failed to show the presence of any viral or bacterial microRNA sequences. A small number of bacterial genomic sequences, including Propionibacterium propionicum (in liver samples) and Enterococcus mundtii (in bile samples), were noted. However, it is extremely unlikely that these have any significance for BA.

Differences in microRNA expression between bile fluid and liver tissue samples
A further analysis of the differences in microRNA expression patterns associated with inflammation and fibrosis between liver and bile samples is shown in Tables 5 and 6 for liver and bile, respectively. The average fold changes for all 265 microRNAs from bile fluid and liver tissue samples were calculated and compared. After removing duplicate mature microRNAs with identical data, this resulted in the identification of 110 microRNAs with statistically significant (P<0.05) patterns of differential regulation between liver tissue and bile samples (see Fig. 4, Table 7).
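The comparison described above can be sketched for a single microRNA as a log2 fold change between the two sample types together with a P value. The statistical test used for the P<0.05 threshold is not restated here, so the exact permutation test below is an assumed stand-in rather than the study's method, and all expression values and function names are invented for illustration.

```python
import math
from itertools import combinations
from statistics import mean


def log2_fold_change(bile_cpm, liver_cpm, pseudocount=1.0):
    """log2 ratio of mean expression (bile over liver); the pseudocount avoids log(0)."""
    return math.log2((mean(bile_cpm) + pseudocount) / (mean(liver_cpm) + pseudocount))


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    hits = 0
    n_perm = 0
    # Enumerate every way of relabelling n_a of the pooled values as group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for floating-point error
            hits += 1
        n_perm += 1
    return hits / n_perm


# Hypothetical normalized counts for one miRNA in four bile and four liver samples.
bile_cpm = [14000, 17500, 11500, 21000]
liver_cpm = [300, 480, 360, 550]

lfc = log2_fold_change(bile_cpm, liver_cpm)
p_value = permutation_p_value(
    [math.log2(x + 1) for x in bile_cpm],
    [math.log2(x + 1) for x in liver_cpm],
)

print(f"log2 fold change (bile vs liver): {lfc:.2f}")
print(f"permutation P value: {p_value:.4f}")
```

With groups this small, the smallest P value an exact permutation test can return is limited by the number of possible relabellings (2 in 70 for four versus four samples), which illustrates why low sample numbers constrain the conclusions that can be drawn.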

DISCUSSION
We used NGS to determine the miRNA profiles in liver tissue and bile samples from 10 patients in a paediatric cohort of 67 children with BA to determine whether there were different expression patterns between type 1 (unknown) and the known congenital forms of biliary disease (choledochal malformation and BASM) and CMV-associated BA. As liver and bile samples are unavailable from healthy donors, those from the congenital and CMV-associated conditions were used as controls to determine any differences in expression unique to type 1 BA.
We focused on examining clinical pathology samples rather than using the more commonly examined cell culture and animal model systems in the hope of finding more direct evidence for the involvement of human- and pathogen-derived miRNAs. Overall, CMV infection was detected in eight (16 %) patients, three of whom were diagnosed with CMV-associated BA. Of the other five patients, three were diagnosed with isolated BA and two with choledochal malformations. It was challenging to draw together CMV serological and molecular results involving different sample types with the groupings, especially as, for example, two of the patients with CMV IgM-positive and indeterminate results had isolated BA. Most patients had been referred from other hospitals and there were no CMV serology results available when requested for 30 of 67. However, CMV DNA was negative in the urine samples from 17 of these patients, therefore excluding active CMV infection at diagnosis. Of the rest, there were no CMV results for any samples available, making any further interpretation difficult other than from the findings of other diagnostic tests carried out. Overall, it could be concluded that as CMV is such a ubiquitous infection, it is likely that there will be a group of children with BA in whom it is difficult to elucidate the pathogenesis in the presence of more than one possible aetiology.
We were particularly interested to see if there was a distinct pattern of expression in the CMV-associated form of biliary atresia and also in finding clear evidence that delineated type 1 BA. Liver tissue and bile samples were analysed to maximize the discovery potential of the investigation. Although the extrahepatic biliary system is involved in the early stages of BA, the intrahepatic bile ducts are subsequently involved in the majority of patients [23]. As the disease progresses, the fibrous obliteration of the bile ducts causes the bile to become trapped and it is likely that the resulting damage to hepatocytes and cholangiocytes contributes to the miRNA patterns found in the bile fluid. Thus by examining bile fluid and liver tissues, a more accurate profile of miRNA expression relating to BA may be found.
A broad overview of the statistically significant differences in microRNA expression is shown in the heat map (Fig. 4). This highlights the clear differences between bile and liver tissue samples and separates the expression patterns into two main groups: (i) high levels of expression in bile, low in liver and (ii) high levels of expression in liver, low in bile. The heatmap (Fig. S1) shows the expression patterns of all 265 microRNAs found in all samples. Here again broad patterns of expression can be seen, with an intermediate pattern, a low-level expression pattern and high levels of expression for both bile and liver samples. In the following sections some of the more intensively studied microRNAs that were also found in our work are briefly discussed in the context of the published literature.

MiR-30
Hand et al. [24] identified upregulation of the miR-30 family during the late stages of murine foetal development, where miR-30a expression in the ductal plate regulates TGF-β, essential for the normal differentiation of hepatoblasts into cholangiocytes and hepatocytes. Failure of ductal plate morphogenesis is associated with approximately 25 % of cases of biliary atresia [24]. Our experiments showed relatively high levels of miR-30a in all liver samples, while miR-30c and miR-30b were expressed at much lower levels (from 4-fold to 26-fold lower; see Table 5). Rogler et al. (2009) [25] showed that bile duct formation in foetal murine hepatocytes was repressed by high levels of miR-23b, miR-27b and miR-24-1. By downregulating Smad (in particular Smad 4) expression and consequently TGF-β signalling, bile duct gene expression was reduced and hepatocyte growth promoted. Conversely, low levels of miR-23b in cholangiocytes permitted TGF-β signalling and bile duct development. We found low levels of miR-23b and miR-24-1, consistent with the development of bile ducts, but high levels of miR-27b, suggesting the suppression of bile duct gene expression and a shift toward hepatocyte differentiation.

Evidence of miRNA expression in response to infection
There are a number of studies investigating the relationship between infection and host miRNA response in liver, mainly focusing on CMV, hepatitis B virus and hepatitis C virus, e.g. [26][27][28][29][30][31]. Other hepatotropic viruses have been less well studied in this respect, but there have been limited reports on rhesus rotavirus in murine BA models [32,33] and liver injury associated with HIV-1 infections [34].
We found evidence of CMV infection in eight patients using serology and quantitative CMV DNA testing in whole blood, urine and bile samples. It was interesting to find that CMV infection was diagnosed in three infants with CMV-associated BA, as well as three with isolated BA and two with choledochal malformations. Bearing in mind that the true incidence of congenitally acquired CMV infection is not known, but likely to be underdiagnosed, one could speculate whether CMV-associated BA should be seen more frequently given the low incidence of BA. This does not exclude the involvement of CMV in the aetiology of BA, especially as there is clear evidence of CMV infection in our cohort. However, NGS did not detect any CMV or other known viral miRNA sequences, or any specific host miRNAs that differentiated the CMV-associated samples from the congenital forms. It is possible that looking for a CMV-specific immune response, rather than the genome, might be helpful if biliary atresia was due to an infectious 'hit and run' pathogenesis.
The upregulation of miR-17, -20, -96, -182 and -185 in response to CMV infection has been described [27,28,30]. In our work, miR-96 was not detected and although miR-182-5p was expressed, it was at similar, low levels in all liver samples, although generally higher in bile samples (see Fig. 4, Table 7). MicroRNA-122 is considered to be a general, non-specific marker of liver injury and its levels are regulated in both liver and serum in a number of liver diseases, including hepatitis B and C virus infection and solid tumours [35]. We found miR-122-5p and miR-122-3p expression in all liver samples, except for sample 6 (BASM), likely due to hepatobiliary injury [36,37], particularly inflammation and fibrosis [38,39].

Inflammation and fibrosis
BA is a liver disease characterized by hepatic injury and inflammation, followed by fibrosis and deposition of extracellular matrix (ECM), distorting the normal liver parenchyma and leading to the obliteration of the extrahepatic bile ducts and portal hypertension [40]. Hepatic stellate cells are the main source of ECM and there have been an increasing number of publications investigating both their function and potential as therapeutic targets to reverse hepatic fibrosis, together with the role of miRNAs in regulating their activation [40]. A number of these miRNAs were identified in our experiments; see Tables 5 and 6 for expression in liver and bile samples, respectively, and also the pathway analysis (Fig. 3, Table 4).

The miR-29 family
In liver tissues, microRNA-29a-3p is the most abundant member of the family, accounting for 70 % of its overall expression [41]. For the seven samples where at least two members of the family were expressed, we found that on average 73 % of expression was accounted for by miR-29a-3p. From the network analysis (Fig. 3), the miR-29 family is predicted to be involved in numerous interactions, particularly with the collagen-encoding genes and the DNA methyltransferase 1 gene (DNMT1). High levels of miR-29 have been shown to have anti-fibrotic effects by suppressing collagen 1A1 mRNA and its protein expression, reducing HSC activation [42]. Wang et al. [43], Szabo and Bala [44] and Kriegel et al. [45] showed that low levels of miR-29 are associated with an increase in fibrosis in BA patients. Our results showed that miR-29b-3p was expressed at low levels in all samples, except sample 6 (BASM), where no expression was found. MicroRNA-29c-3p expression was also low, and absent in three samples: 5 (BASM), 6 (BASM) and 8 (isolated BA).
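The family-share figure quoted above can be reproduced in outline as follows. This is a sketch under stated assumptions: the share is taken per sample and then averaged across samples in which at least two family members were expressed, and the sample names and expression values are invented rather than taken from Table 5.

```python
FAMILY = ("miR-29a-3p", "miR-29b-3p", "miR-29c-3p")


def mir29a_share(sample_expr):
    """Fraction of miR-29 family expression due to miR-29a-3p.

    Returns None when fewer than two family members are expressed,
    mirroring the inclusion rule described in the text.
    """
    values = {member: sample_expr.get(member, 0.0) for member in FAMILY}
    n_expressed = sum(1 for v in values.values() if v > 0)
    if n_expressed < 2:
        return None
    return values["miR-29a-3p"] / sum(values.values())


# Hypothetical normalized expression values for three samples.
samples = {
    "sample_X": {"miR-29a-3p": 700.0, "miR-29b-3p": 200.0, "miR-29c-3p": 100.0},
    "sample_Y": {"miR-29a-3p": 800.0, "miR-29b-3p": 200.0},
    "sample_Z": {"miR-29a-3p": 500.0},  # only one member expressed: excluded
}

shares = [
    share
    for share in (mir29a_share(expr) for expr in samples.values())
    if share is not None
]
average_share = sum(shares) / len(shares)

print(f"samples included: {len(shares)}")
print(f"average miR-29a-3p share: {average_share:.0%}")
```

Averaging per-sample shares gives each sample equal weight regardless of sequencing depth; pooling counts across samples before dividing would be an alternative and could give a slightly different figure.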

miR-483
The cooperative roles of miR-483-3p/-5p in inhibiting liver fibrosis in transgenic mice have been shown by Li et al. [46]. Their work demonstrated that by acting together, miR-483-5p/-3p target two pro-fibrosis factors, platelet-derived growth factor-β and tissue inhibitor of metalloproteinase 2, thereby suppressing the activation of hepatic stellate cells (HSCs). Recent work by Chen et al. [47] supports these findings. We found low levels of miR-483-5p and higher levels of miR-483-3p (up to 10-fold higher), although the differences between BA types did not reach statistical significance. The relatively low levels of miR-483-5p in our samples suggest that inflammation and subsequent fibrosis due to the unchecked activation of HSCs is possible.

MiR-181, miR-199 and miR-200
The miRNet pathway analysis (Fig. 3, Table 4) shows miR-181a-5p, 199a-3p, 200a-3p and 200b-3p associating directly with genes involved with collagen synthesis, ECM and mRNA export and stability. MicroRNA-199a, -200a and -200b were shown to be positively correlated with liver fibrosis and it was shown that there were significant differences in expression according to the classification of the fibrosis [48]. We found relatively high levels of identical expression of miR-199a-3p and -199b-3p in all samples. However, miR-199a-5p, miR-200a and miR-200b were less abundant, with a 10-30-fold reduction in expression compared to miR-199a/b-3p; these differences may be due to differences in the grade of fibrosis between patients.
Zheng et al. [49] and Yu et al. [50] found that HSC activation and the subsequent development of fibrosis is, at least in part, controlled by phosphatase and tensin homologue (PTEN) expression via a cascade of reactions involving miR-181b. It was also shown that miR-181b expression increased more than miR-181a in vitro in TGF-β-induced HSC activation [50]. However, our experiments showed up to 30-fold higher levels of miR-181a expression compared with miR-181b.
Brockhausen et al. [51] also demonstrated that miR-200 showed little or no differential expression during TGF-β-driven EMT, while miR-181a was significantly induced early on in the process. This seems to correlate with the results we found, with low expression of miR-200a/-200b and higher levels of miR-181a, potentially highlighting a microRNA signature of fibrosis in our patients.
All of these publications and their differing results highlight some of the difficulties in making useful comparisons in microRNA expression studies. In our work, the differences in expression patterns could be due to the fact that the liver tissue analysed is likely to contain a mixture of hepatocytes, cholangiocytes and HSCs, from which a very mixed population of microRNAs would have been sampled.

MicroRNA expression in bile samples
MicroRNAs in bile fluid are likely derived from both circulating exosomes and also from intact epithelial cells desquamated from the bile tract [52]. One of the difficulties in interpreting our data is the origin of the bile fluid, since the disease itself precludes the presence of distinct bile ducts. Even when sampling from patent ducts, unfractionated bile is known to contain cellular fragments, making it difficult to define a unique microRNA profile [53]. This is further complicated by the fact that the mechanisms controlling miRNA release into the bile have yet to be determined. However, our results demonstrate consistent and statistically significant differences between liver and bile samples, highlighting their potential value as a source of biomarkers.
Overall, there were 110 microRNAs statistically significantly differentially expressed between bile and liver samples (P value <0.05). Of these, 54 were more highly expressed in bile samples and 56 more highly expressed in liver samples; Fig. 4 and Table 7 illustrate these differences.

MiR-486
One striking difference in microRNA levels was the high and consistent expression of miR-486-5p in bile samples (Fig. 4, S1 and Table S1), particularly in patients 3 (CMV-associated), 6 (BASM) and 19. MicroRNA-486 is known to be one of the most abundant miRNAs in red blood cells [54]. There have also been a number of reports highlighting the association of miR-486 with fibrosis in the lung [55] and kidney [56,57], although no clear association with liver inflammation and fibrosis has been established yet. The differential pattern of high expression in bile and relatively low levels in liver tissue in our samples is interesting. If the microRNA is meant to protect against inflammation and fibrosis, the low levels in the liver may be a signature of such damage. The high levels in the bile may also be a key stage in the exosomal transport of miR-486-5p in response to fibrotic injury.

Conclusion
We have presented a comprehensive analysis of the microRNA expression patterns of bile and liver tissue samples from a small number of patients diagnosed with biliary atresia and control samples with CMV-associated BA, choledochal malformations and BASM. While we were unable to use samples from healthy donors, we anticipated a specific miRNA profile that distinguished between the different forms of the disease. This could either present as known or potentially novel miRNAs from pathogens or a definitive host response to infection.
No differential expression patterns were found that would enable the classification of disease types. What we did find was clear evidence of microRNAs associated with hepatocyte and cholangiocyte differentiation, together with fibrosis and inflammation, and statistically significant differences in expression between bile and liver samples. Whether the differential expression of pro- and anti-inflammatory microRNAs in the two sample types evaluated represents a transient shift from export to import remains to be determined. The differences (and similarities) between circulating microRNAs found in other studies and those found in bile samples in this work are of interest and also require further investigation.
There are a number of limitations to our study. As described in the Introduction we did not have access to true non-BA controls. Rather, our aim was to identify microRNA markers associated with BA and determine if there were patterns of expression that could distinguish the isolated, idiopathic BA, from the better characterized BASM, choledochal malformations and CMV-associated forms of the disease. Although the number of samples tested was low, this is a reflection of the disease prevalence across the different forms of the condition required for the analysis and this also meant that true biological replicates were unavailable. A further issue with our results was that we were unable to independently test our findings by PCR. However, our work is supported in some aspects by others and also highlights the difficulties in interpreting the differences in microRNA expression patterns and levels by the various analytical methods, particularly primer and probe sequence variation in expression arrays and comparing array-based methods and NGS-derived data. For example, there are striking differences between reports on the abundance of miR-200 and miR-181a-5p in inflammation and also difficulties in interpreting the many murine models and their relationship to human disease.
This study has revealed a number of potential biomarkers associated with the development of the liver and bile ducts and liver inflammation. We have found a set of miRNAs that appear to indicate damage to the biliary tract, rather than a specific set of indicators for any one type of biliary atresia. Those associated with liver and bile duct development (miR-30 and the miR-23 cluster) and inflammation and fibrosis (miR-29, miR-483, miR-181, miR-199 and miR-200) need to be investigated in more detail using quantitative real-time RT-PCR.

Funding information
The project was funded by the Children's Liver Disease Foundation (CLDF project reference number: NL 1752) and was also part-funded by the National Institute for Health Research (NIHR) Biomedical Research Centre at Guy's and St Thomas' NHS Foundation Trust and King's College London.