Advances in high-throughput experimental analyses have had a profound impact on biomedical research. Following the Human Genome Project, the availability of genome sequences has fueled the growth of functional genomics and systems biology/medicine. By integrating large-scale databases with existing scientific knowledge, these fields facilitate scientific progress through the identification of new associations and correlations. In this issue of Digestive Diseases and Sciences, Clarke et al. [1] report the characterization of hepatocellular carcinoma (HCC)-related genes and metabolites in human nonalcoholic fatty liver disease (NAFLD) using systems biology methods.

NAFLD comprises a group of multifactorial progressive liver disorders that together constitute the predominant liver disease worldwide. The prevalence of NAFLD is increasing rapidly due to the related epidemics of obesity and diabetes. The disease spectrum of NAFLD ranges from simple non-progressive steatosis to steatohepatitis (NASH), which is a more serious form of liver damage characterized by fibrosis and variable amounts of visible fat. NASH can progress further to cirrhosis and to hepatocellular carcinoma (HCC) [2, 3]. NAFLD was initially considered the hepatic component of the metabolic syndrome due to its strong associations with insulin resistance and links to visceral obesity. NAFLD is now considered a multifactorial condition that leads not only to increased liver-related mortality but also to an increased risk of developing type 2 diabetes mellitus and cardiovascular diseases (CVD) [4]. Abnormalities in lipid and lipoprotein metabolism accompanied by chronic inflammation are crucial for the development of metabolic syndrome-related diseases, in which cholesterol, as the lipid component, lies at the crossroads of NAFLD and atherogenesis. NAFLD-associated pathologies can thus on the one hand be triggered by altered metabolism, such as that associated with insulin resistance or dyslipidemia, while on the other hand NAFLD can itself increase the risk of developing diabetes mellitus and cardiovascular diseases [4]. This complex network of interactions has common roots in the primary metabolic pathways of the liver and of other tissues. Fatty hepatic infiltration can critically affect hepatic drug metabolism due to alteration of the nuclear receptor network [5], which in turn can affect the severity of NAFLD through alterations of drug metabolism.

Despite the apparent common origins of NAFLD, HCC, and type 2 diabetes (T2D), it appears that the differing genetic and environmental factors to which each individual is exposed might explain the inter-individual variability of disease progression. This likely underlies the current discrepancy between the increasingly detailed understanding of the molecular mechanisms of the disease and the lack of a consistent global picture that would aid clinical interventions. At present, the genome-wide association studies (GWAS), transcriptome analyses, meta-analyses, and other clinical studies conducted in different populations of varying ethnic backgrounds are concordant in that polymorphisms of PNPLA3 (patatin-like phospholipase domain containing 3) associate with steatosis and progression to NASH and HCC. Since the function of the PNPLA3 protein in lipolysis is not completely understood, the challenge is to understand how loss-of-function PNPLA3 variants promote fat accumulation and fibrosis and even increase the risk of HCC.

By 2015 over 2.3 billion adults will be overweight. Over 20 % of the Western population, over 60 % of diabetics, and over 90 % of the morbidly obese will develop reversible steatosis. Moreover, up to 15 % of the Western population, 25–30 % of obese people or those with T2D, and over 35 % of those who are severely obese (BMI over 30 kg/m²) will develop NASH [6]. The increasing incidence of NAFLD contributes to the rising incidence of HCC, to which viral hepatitis and alcoholic liver disease also contribute. The NAFLD–HCC correlation indicates that metabolic abnormalities might be an important risk factor for cancer development.

Since metabolic diseases and cancer are not frequently associated, it is intriguing to consider the molecular signatures (disease-associated genes, transcripts, proteins, metabolites) of each disease individually and in common. Clarke et al. [1] addressed this issue by performing a transcriptome analysis of well-phenotyped clinical samples in four categories: steatosis (n = 10), NASH with fatty infiltration (n = 9), NASH without fatty infiltration (n = 7), and normal controls (n = 19). Gene expression patterns (and metabolite profiles) were compared with published data for HCC carcinogenesis, a somewhat flawed approach given that the HCC in the published data was not proven to have been provoked by NASH. Nevertheless, the analysis provided novel global insights into the molecular signatures of NAFLD/NASH and HCC. Hierarchical clustering (a data mining method that forms hierarchical groups of objects with similar characteristics) did not distinguish between NASH with or without fatty liver, indicating that inflammation and fibrosis are the predominant hallmarks of NASH, with gene expression signatures distinct from those of steatosis. The majority of gene expression changes were observed during progression from steatosis to NASH, reminiscent of the changes observed when clustering was based on genes encoding components of drug metabolic pathways [7]. Gene set enrichment analysis revealed that extracellular matrix/angiogenesis genes are significantly upregulated whereas genes associated with iron homeostasis are downregulated [1]. Another hallmark of NASH in this study is dysregulation of the Wnt signaling pathway, with downregulation of Wnt activators and upregulation of Wnt inhibitors, the opposite of the pattern seen in HCC. Wnt signaling is among the best characterized oncogenic pathways in HCC, with activation occurring via β-catenin mutations. An integrative transcriptome analysis from 2009 [8] identified the aberrant activation of Wnt signaling as the major feature of one of the three HCC subclasses. This is concordant with the observation that pathways in cancer were among the enriched KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways in NASH [1], suggesting that NASH initiates a sequence of carcinogenesis events that includes dysregulated Wnt signaling and eventuates in HCC (Fig. 1).
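For readers less familiar with hierarchical clustering, the idea can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used by Clarke et al. [1]: the sample names, the three-gene expression profiles, and the choice of Euclidean distance with average linkage are all assumptions made for the example, and the numbers are invented to mimic the qualitative pattern described above.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering: start with one cluster per sample and
    repeatedly merge the two clusters with the smallest mean pairwise
    distance until n_clusters remain. Returns lists of sample indices."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for (i, ci), (j, cj) in combinations(enumerate(clusters), 2):
            total = sum(euclidean(profiles[a], profiles[b]) for a in ci for b in cj)
            dist = total / (len(ci) * len(cj))
            if best is None or dist < best[0]:
                best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge j into i (i < j)
        del clusters[j]
    return clusters

# Invented toy data: expression of three hypothetical genes per sample.
samples = {
    "control_1": [1.0, 1.1, 0.9],
    "control_2": [1.1, 0.9, 1.0],
    "steatosis_1": [1.4, 1.2, 1.0],
    "steatosis_2": [1.5, 1.3, 0.9],
    "NASH_fatty_1": [3.0, 3.2, 0.3],
    "NASH_fatty_2": [3.1, 2.9, 0.4],
    "NASH_nonfatty_1": [3.2, 3.0, 0.2],
    "NASH_nonfatty_2": [2.9, 3.1, 0.3],
}
names = list(samples)
profiles = [samples[name] for name in names]

# Cut the hierarchy at two clusters and report the members of each.
groups = [sorted(names[i] for i in cluster) for cluster in average_linkage(profiles, 2)]
for group in groups:
    print(group)
```

With these invented values, the two NASH subtypes fall into a single cluster because their profiles are nearly identical, while controls and steatosis samples form the other. Real analyses use thousands of genes and a dedicated library, and the outcome depends on the distance measure and linkage rule chosen.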

Fig. 1

NAFLD takes different forms, from steatosis (S) and steatohepatitis (NASH), with or without visible fat, to cirrhosis (C) and hepatocellular carcinoma (HCC). It is now well recognized that NAFLD and type 2 diabetes mellitus (T2D) are interdependent, especially owing to insulin resistance and other features of the metabolic syndrome. Also new is the perception of the interdependency of NAFLD and cardiovascular disease (CVD), where the common metabolic link is dyslipidemia. The complex network of interactions presented in this figure has been deduced from the cited references. Genes in which polymorphisms associate with NASH and with HCC (PNPLA3, SOD2, TNF, GST) have been discussed [3]. Deregulation of Wnt signaling, angiogenesis, iron homeostasis, and pathways in cancer was reported in NAFLD progression towards HCC [1], together with some metabolites that are not discussed herein since they are at present difficult to link to current genetic knowledge.

Another gene expression profiling study reported that cancer-related molecular signatures are associated with NASH but not with steatosis [9]. Although the sample size of this study (10 NASH, 30 steatosis, 18 controls) was similar to that of Clarke et al. [1], the steatohepatitis group included alcoholic and non-alcoholic subgroups sharing similar hepatic morphology. Unsupervised clustering divided the samples into steatosis and NASH groups. Only metabolic processes were upregulated in steatosis, whereas in NASH, cancer progression-associated pathways, such as those regulating extracellular matrix organization, were upregulated. Two of the most differentially expressed genes in NASH were the aldose reductase AKR1B10 and the keratin family member KRT23, both identified previously as marker candidates in the progression of steatohepatitis to HCC [9].

The conclusions from the two expression-profiling studies, each derived from small populations, are generally in agreement, notwithstanding their identification of different gene clusters, which perhaps reflects methodological differences. A meta-analysis of both studies [1, 9] would likely augment understanding of the pathways linking NASH with HCC. It would also likely help fill the current gaps between conclusions derived from small-scale gene expression profiling studies and large-scale analyses of SNPs (single nucleotide polymorphisms) or GWAS. Nahon and Zucman-Rossi [10] reviewed the most robust and reliable published data regarding associations between genetic variants and the risk of HCC, including large-scale case–control studies, meta-analyses, prospective studies, and GWAS. Despite limitations in combining studies with different methodological drawbacks, the review presents a comprehensive list of SNPs associated with the risk of HCC development. Further progress is impeded by heterogeneity not only of study methodology but also of data presentation, wherein the use of different nomenclatures (gene nomenclature, KEGG terms, gene ontology (GO) terminology, etc.) can add confusion and interpretational variability. Short of performing a new meta-analysis based on publicly available datasets, it is not easy to conclude whether the hepatic genes identified in the two gene expression profiling studies discussed [1, 9] are congruent with the SNPs associated with HCC [10]. Although the gene clusters associated with oxidative stress, iron metabolism, inflammatory and immune response, DNA repair, and cell cycle regulation pathways are considered the most likely to be involved in the progression to HCC, the contribution of individual genes to this sequence has not been conclusively established. When SNPs curated from the NAFLD-related literature [3] were compared with those reviewed in [10], common SNPs were identified in only four genes: GSTM1 and SOD2 from the oxidative stress response pathway, TNF from the inflammatory/immune response, and the previously mentioned PNPLA3 from lipid metabolism. Sample heterogeneity, however, was present, since variants associating with NAFLD were obtained from non-alcoholic patients whereas the HCC variants were obtained from studies comparing normal controls to hepatitis B or C (HBV or HCV)-infected patients or patients with a history of alcohol abuse.
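The kind of gene-list comparison described above, and the way inconsistent nomenclature can obscure it, can be illustrated with a short Python sketch. It is hypothetical throughout: only the four shared gene symbols come from the text, whereas the placeholder entries, the spelling variants, and the small alias table are assumptions introduced for illustration and do not reproduce the curated lists in [3] or [10]. Any real alias mapping should be checked against an authoritative nomenclature resource.

```python
# Illustrative alias table (an assumption for this sketch, to be verified
# against an authoritative gene nomenclature resource before real use).
ALIASES = {"TNFA": "TNF", "ADPN": "PNPLA3"}

def normalize(symbol):
    """Trim whitespace, uppercase, and map known aliases to one symbol."""
    cleaned = symbol.strip().upper()
    return ALIASES.get(cleaned, cleaned)

def shared_genes(list_a, list_b):
    """Return the sorted gene symbols present in both lists after normalization."""
    return sorted({normalize(g) for g in list_a} & {normalize(g) for g in list_b})

# Toy lists: the placeholders stand in for genes reported in only one source,
# and the spelling variants mimic inconsistent reporting across studies.
nafld_genes = ["PNPLA3", "SOD2", "TNFA", "GSTM1", "PLACEHOLDER_A"]
hcc_genes = ["gstm1", " SOD2", "TNF", "adpn", "PLACEHOLDER_B"]

common = shared_genes(nafld_genes, hcc_genes)
print(common)  # ['GSTM1', 'PNPLA3', 'SOD2', 'TNF']
```

Without the normalization step, a naive comparison of these two toy lists would find no overlap at all, which is the practical cost of heterogeneous nomenclature noted above.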

In summary, the term NAFLD defines a spectrum of progressive liver diseases that can progress to HCC and is associated with T2D, abdominal obesity, and cardiovascular diseases. From a public health perspective, a pressing need exists to address this looming disease burden by identifying disease markers or potential drug targets. Due to the complexity of interactions, the variety of potentially contributing factors, and the absence of specific diagnostics and treatments, NAFLD is a prime example of a disease that might benefit from a systems medicine approach. The EU-funded CASyM (Coordinating Action Systems Medicine) project (https://www.casym.eu/) offers a wider forum for discussion of systems approaches in medicine and is open to interested parties.