Trends in Genetics
Volume 19, Issue 5, May 2003, Pages 238-242

Genome Analysis
Predicting gene function by conserved co-expression

https://doi.org/10.1016/S0168-9525(03)00056-8

Abstract

We show that gene co-expression, which generally provides only a very weak signal for the prediction of functional interactions, can provide a reliable signal by exploiting evolutionary conservation. The encoded proteins of conserved co-expressed gene pairs are highly likely to be part of the same pathway not only after speciation (98%), but also after parallel gene duplication (97%). Conserved co-expression combined with homology data enables us to predict specific gene functions. The use of conservation between parallel duplicated gene pairs to predict function is especially promising given that gene duplication is common in eukaryotes, and that data from only a single organism can be used.

Section snippets

Co-expression provides a weak signal for pathway prediction

Two large-scale expression datasets were obtained, one from S. cerevisiae [2] and one from C. elegans [3]. Uncentered correlation [1] was calculated between the expression profiles of all S. cerevisiae genes and between the expression profiles of all C. elegans genes. The higher the correlation (R) between two genes, the more probable it is that they act in the same pathway (Fig. 1). However, at a significant correlation threshold of 0.6 (P<0.005, Table 1), the fraction of annotated proteins
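As an illustration of the measure named above, the following is a minimal sketch of an uncentered correlation between two expression profiles, followed by a pairwise scan at the 0.6 threshold. The function names, the toy profiles, and the choice to treat an all-zero profile as uncorrelated are assumptions made for this example; they are not taken from the article, and the article's own implementation may differ.

```python
import math


def uncentered_correlation(x, y):
    """Uncentered correlation: like Pearson's r, but without subtracting the means."""
    if len(x) != len(y):
        raise ValueError("expression profiles must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        # Assumption for this sketch: an all-zero profile counts as uncorrelated.
        return 0.0
    return dot / (norm_x * norm_y)


def coexpressed_pairs(profiles, threshold=0.6):
    """Return {(gene1, gene2): R} for all gene pairs with R >= threshold."""
    genes = sorted(profiles)
    pairs = {}
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r = uncentered_correlation(profiles[g1], profiles[g2])
            if r >= threshold:
                pairs[(g1, g2)] = r
    return pairs


# Hypothetical toy profiles (one value per experiment); not data from the article.
profiles = {
    "geneA": [1.0, 2.0, -1.0, 0.5],
    "geneB": [0.9, 2.1, -0.8, 0.4],
    "geneC": [-1.0, 0.2, 1.5, -0.3],
}
pairs = coexpressed_pairs(profiles)
```

With these toy profiles only the pair (geneA, geneB) passes the threshold. The all-against-all loop is quadratic in the number of genes, which is acceptable for a sketch but would normally be vectorized for genome-scale data.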

Significant levels of evolutionary conservation of co-expression

To evaluate whether evolutionary conservation (Fig. 2) can improve upon these limits in the use of co-expression for function prediction, we first established whether co-expression is significantly conserved, which would potentially reflect selection pressure to maintain functional interactions. To determine conservation between S. cerevisiae and C. elegans, we first need to define which genes are orthologs of each other; we do this using phylogenetic trees, allowing for many-to-many orthology
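One way to picture conservation of co-expression under many-to-many orthology is sketched below. It assumes that a co-expressed pair in one species counts as conserved when at least one combination of their orthologs is co-expressed in the other species; that rule, the function name, and all gene identifiers are hypothetical choices for illustration, since the snippet above does not spell out the exact criterion.

```python
from itertools import product


def conserved_pairs(coexp_a, coexp_b, orthologs):
    """Return the pairs from coexp_a whose orthologs are co-expressed in coexp_b.

    coexp_a, coexp_b: sets of frozenset({gene1, gene2}) for species A and B.
    orthologs: dict mapping each species-A gene to a set of species-B genes
               (many-to-many orthology is allowed).
    """
    conserved = set()
    for pair in coexp_a:
        g1, g2 = sorted(pair)
        candidates = product(orthologs.get(g1, ()), orthologs.get(g2, ()))
        # Assumed rule: one co-expressed ortholog combination is enough.
        if any(frozenset((o1, o2)) in coexp_b for o1, o2 in candidates):
            conserved.add(pair)
    return conserved


# Hypothetical identifiers; "y" genes stand for species A, "w" genes for species B.
coexp_yeast = {frozenset(("yA", "yB")), frozenset(("yA", "yC"))}
coexp_worm = {frozenset(("wA2", "wB"))}
orthologs = {"yA": {"wA1", "wA2"}, "yB": {"wB"}, "yC": {"wC"}}

conserved = conserved_pairs(coexp_yeast, coexp_worm, orthologs)
```

Here only the pair (yA, yB) is reported as conserved, because yA has an ortholog (wA2) that is co-expressed with the ortholog of yB. A stricter rule, such as requiring all ortholog combinations to be co-expressed, would be a drop-in change to the `any` call.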

Conserved co-expression improves accuracy of pathway prediction

Does the conservation of co-expression after gene duplication or speciation increase the likelihood of a functional relationship between co-expressed genes? Conservation after duplication in S. cerevisiae does indeed increase the accuracy levels for prediction of functional interactions, albeit at the expense of coverage of known interactions (Fig. 1). The results for C. elegans are similar, but there are not enough genes annotated in the PATHWAY database to establish the accuracy for conserved
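The trade-off between accuracy and coverage mentioned above can be made concrete with a small sketch. The definitions used here are assumptions for illustration, not the article's stated formulas: accuracy is taken as the fraction of predicted pairs, with both genes annotated, that share at least one pathway, and coverage as the fraction of all annotated same-pathway pairs that were predicted. Gene and pathway names are hypothetical.

```python
from itertools import combinations


def accuracy_and_coverage(predicted, pathways):
    """Score predicted gene pairs against pathway annotation.

    predicted: set of frozenset({gene1, gene2}).
    pathways: dict mapping annotated genes to a set of pathway labels.
    Returns (accuracy, coverage); a value is None when it is undefined.
    """
    def same_pathway(pair):
        g1, g2 = sorted(pair)
        return bool(pathways[g1] & pathways[g2])

    # Only pairs with both genes annotated can be scored (assumed convention).
    scorable = [p for p in predicted if all(g in pathways for g in p)]
    correct = [p for p in scorable if same_pathway(p)]

    # All annotated gene pairs that share at least one pathway.
    known = {
        frozenset((a, b))
        for a, b in combinations(sorted(pathways), 2)
        if pathways[a] & pathways[b]
    }

    accuracy = len(correct) / len(scorable) if scorable else None
    coverage = len(known & set(predicted)) / len(known) if known else None
    return accuracy, coverage


# Hypothetical annotation and predictions; "gX" is unannotated.
pathways = {"g1": {"P1"}, "g2": {"P1"}, "g3": {"P2"}, "g4": {"P2"}, "g5": {"P1"}}
predicted = {
    frozenset(("g1", "g2")),
    frozenset(("g1", "g3")),
    frozenset(("g3", "gX")),
}
accuracy, coverage = accuracy_and_coverage(predicted, pathways)
```

In this toy case one of the two scorable predictions is correct, and one of the four known same-pathway pairs is recovered. Filtering predictions by conservation would be expected to shrink `predicted`, which can raise accuracy while lowering coverage.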

New predictions from old data

Conserved co-expression between S. cerevisiae and C. elegans of the hypothetical gene CAT5 (YOR125C, ZC395.2) and COQ2 (YNR041C, F57B9.4) confirms earlier predictions, based on knock-out experiments [11] and homology relations [12], that CAT5 encodes 2-polyprenyl-3-methyl-6-methoxy-1,4-benzoquinone mono-oxygenase, an enzyme involved in ubiquinone synthesis. COQ2 encodes para-hydroxybenzoate:polyprenyl transferase, which is also involved in ubiquinone synthesis.

A prediction based on conservation of

Modularity in pathway evolution

Of particular evolutionary importance is our finding of a substantial number of cases where, although the expression patterns of A′ and B′ have changed relative to those of their ancestors A and B, the co-expression of A′ and B′ is conserved. This seemingly contradicts the finding by Wagner that, after duplication events, mRNA expression patterns diverge very quickly relative to amino acid sequence [21]. Yet the two results complement each other, as we show that co-expression is often conserved even when

Outlook

Correlations between expression profiles do not necessarily imply co-regulation, and co-regulation does not always indicate functional interaction. Thus, it is important for function prediction to increase the reliability of co-expression data. Overlapping transcriptional clusters from different clustering methods have led to the prediction of functional categories for many genes [5]. Here we show that both intraspecies and interspecies conservation make expression data useful for the reliable

Acknowledgements

This work was supported in part by a grant from the Netherlands Organization for Scientific Research (NWO).

References (31)

  • T.F. Smith et al. Identification of common molecular subsequences. J. Mol. Biol. (1981)
  • W.R. Pearson. Empirical statistical estimates for sequence similarity searches. J. Mol. Biol. (1998)
  • M.B. Eisen. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. U. S. A. (1998)
  • S.K. Kim. A gene expression map for Caenorhabditis elegans. Science (2001)
  • L.F. Wu. Large-scale prediction of Saccharomyces cerevisiae gene function using overlapping transcriptional clusters. Nat. Genet. (2002)