Research Article
Prognostic prediction of carcinoma by a differential-regulatory-network-embedded deep neural network

https://doi.org/10.1016/j.compbiolchem.2020.107317

Abstract

Accurate prognostic prediction is essential for precise diagnosis and treatment of carcinoma. In addition to clinical survival prediction methods, many computational methods based on transcriptomic data have been proposed to build prediction models and study the prognosis of cancer patients. We propose a differential-regulatory-network-embedded deep neural network (DRE-DNN) method that integrates differential regulatory analysis based on a gene co-expression network with a deep neural network (DNN). From three public hepatocellular carcinoma (HCC) datasets, we derive a differential regulatory network and embed its regulatory information into the DNN. Using 1869 differential regulatory genes and survival data, we apply DRE-DNN to build a prediction model. We compare our method with a normal DNN that takes all genes as features, and the results show that our method has better generalization ability and accuracy. We thus modify the normal DNN and develop an efficient method to predict the prognosis of HCC from gene expression data. Our method reduces the inconsistency caused by overfitting when the training sample size is small. DRE-DNN can also be extended to prognostic prediction of other cancers.

Section snippets

Background

Carcinoma is one of the common diseases that cause death in humans. Cancer occurrence and development is usually a multi-factor, multi-step complex process (Hanahan and Weinberg, 2011; Greaves and Maley, 2012). According to the latest global cancer statistics in 2018, there are an estimated 18.19 million new cancer cases and 9.6 million cancer deaths worldwide (Bray et al., 2018). Therefore, accurate prediction of the survival time is essential for precise cancer diagnosis and treatment, which

Data sets

Three public HCC data sets, GSE10143 (Yang et al., 2013), GSE14520 (Hoshida et al., 2008) and TCGA (Roessler et al., 2012), are used in our study, as shown in Table 1. Each data set consists of a gene expression profile and clinical data for disease state measurements.

GSE10143 has 80 cancer patient tissue samples and 82 normal tissue samples, containing 6100 characteristic genes. GSE14520 has 221 cancer patient tissue samples and 210 normal tissue samples, containing 13,050

DRE-DNN training results

We apply the proposed DRE-DNN and normal DNN models to the three datasets from the GEO and TCGA databases.

On the training set, we performed 10 verification experiments. The Area Under the Curve (AUC) value objectively reflects the model's overall ability to distinguish positive from negative samples, and it accounts for the impact of unbalanced data. The greater the AUC value, the stronger the predictive ability. Table 2 shows that the average AUC values for prediction results of the
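To make the AUC interpretation above concrete, the following is a minimal sketch of one standard way to compute it: as the fraction of positive/negative sample pairs in which the positive sample receives the higher score, with ties counted as half. The function name `auc_score` and the toy labels and scores are hypothetical and are not taken from the paper's data or code.

```python
import numpy as np

def auc_score(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    # Compare every positive score with every negative score; ties count as half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Hypothetical, unbalanced toy data: 2 positive and 6 negative samples.
labels = [1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.2, 0.2, 0.1, 0.1]
toy_auc = auc_score(labels, scores)  # 11 of the 12 pairs are ordered correctly
```

Because the value depends only on how positives rank relative to negatives, it does not change when the ratio of positive to negative samples changes, which is why it is often preferred to plain accuracy on unbalanced data.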

Conclusions

Deep learning methods are very promising and non-trivial in bioinformatics studies. Embedding valuable genetic information into a DNN to predict cancer prognosis helps to prevent overfitting when dealing with high-dimensional transcriptomic data. DRE-DNN integrates differential regulatory analysis based on a gene co-expression network with the DNN method, which yields a better survival prediction for each cancer data set.
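The general idea of combining differential co-expression with a network-constrained layer can be illustrated with a short sketch. This is not the authors' implementation: the Fisher z-test on correlation differences, the cutoff `z_cutoff`, the function names, and the masking scheme (one hidden unit per gene, connected only to itself and its differential neighbours) are all assumptions chosen for illustration.

```python
import numpy as np

def differential_coexpression(tumor, normal, z_cutoff=3.0):
    """Flag gene pairs whose correlation differs between two conditions.

    tumor, normal: arrays of shape (samples, genes).
    Returns a symmetric boolean (genes, genes) matrix (assumed criterion).
    """
    r_t = np.corrcoef(tumor, rowvar=False)
    r_n = np.corrcoef(normal, rowvar=False)
    # Fisher z-transform; clip so perfect correlations do not become infinite.
    z_t = np.arctanh(np.clip(r_t, -0.999999, 0.999999))
    z_n = np.arctanh(np.clip(r_n, -0.999999, 0.999999))
    se = np.sqrt(1.0 / (tumor.shape[0] - 3) + 1.0 / (normal.shape[0] - 3))
    edges = np.abs((z_t - z_n) / se) > z_cutoff
    np.fill_diagonal(edges, False)
    return edges

def masked_layer(x, weights, mask):
    """ReLU layer whose weights are zeroed wherever the mask is False."""
    return np.maximum(0.0, x @ (weights * mask))

# Hypothetical toy data: genes 0 and 1 are co-expressed only in tumor samples.
rng = np.random.default_rng(0)
normal = rng.normal(size=(40, 4))
tumor = rng.normal(size=(40, 4))
tumor[:, 1] = tumor[:, 0] + 0.1 * rng.normal(size=40)

edges = differential_coexpression(tumor, normal)
mask = edges | np.eye(4, dtype=bool)   # each gene also connects to its own unit
weights = rng.normal(size=(4, 4))
hidden = masked_layer(tumor, weights, mask)
```

Restricting the first layer to network-supported connections reduces the number of free parameters, which is one plausible route to the reduced overfitting described above.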

In this work, we embed regulatory information into DNN and

Additional files

All additional files are available at: https://github.com/biohitszcs2019/DREDNN

Authors’ contributions

JL and YP designed the study, performed bioinformatics analysis and drafted the manuscript. All of the authors performed the analysis and participated in the revision of the manuscript. JL and YW conceived of the study, participated in its design and coordination and drafted the manuscript. All authors read and approved the final manuscript.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was supported by the grants from the National “863” Key Basic Research Development Program (2014AA021505), the National Key Research Program (2017YFC1201201) and the startup grant of Harbin Institute of Technology (Shenzhen).

References (33)

  • M. Greaves et al. Clonal evolution in cancer. Nature (2012)
  • H. Han et al. The coming era of artificial intelligence in biological data science. BMC Bioinformatics (2019)
  • Y. Hoshida et al. Gene expression in fixed tissues and outcome in hepatocellular carcinoma. New England J. Med. Surg. Collat. Branches Sci. (2008)
  • A. Kassambara et al. Package ‘survminer’ (2017)
  • S. Kim et al. Network-based penalized regression with application to genomic data. Biometrics (2013)
  • D. Kingma et al. Adam: a method for stochastic optimization. Comput. Sci. (2014)
1 These authors contributed equally to this work.
