Genome-Wide Detection and Analysis of Multifunctional Genes

Fig 1

Schematic representation of the pipeline to identify multifunctional genes.

We define as multifunctional all genes that have two or more annotations by distinct terms of comparable specificity. (A) First, we extract a subset of Gene Ontology (GO) terms at a comparable level of specificity. For a specificity threshold N, we select all terms that annotate at least N but fewer than 2N genes and whose every descendant term (if any) annotates fewer than N genes. For example, if N = 90, then terms A and F are selected because each of them annotates more than 90 and fewer than 180 genes, and each of their descendant terms annotates fewer than 90 genes. In contrast, term E is rejected because its descendant term F annotates more than 90 genes. Term H is also rejected because it annotates more than 180 genes. (B) Once terms at a given specificity level have been selected, we extract all genes annotated with at least two such terms. To consider annotations by distinct terms only, we take all pairs of terms selected at the chosen level of specificity and filter out those that either share a common ancestor (other than the root) or have a common descendant term in the GO graph. We further remove all pairs of terms that co-annotate more genes than expected by chance, as measured by the hypergeometric test. All genes co-annotated by some pair of terms (chosen at any considered level of specificity) that passes these two filters are considered multifunctional.
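The two steps in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy ontology (terms A, E, F, H under a single root, scaled down to N = 3 and 20 genes), the assumption that gene sets are already propagated up the graph, and the significance cutoff `alpha = 0.05` are all assumptions made here for illustration. The caption does not state which cutoff is used for the hypergeometric test.

```python
from itertools import combinations
from math import comb


def descendants(term, children):
    """All descendants of `term` in a DAG given as a parent -> children mapping."""
    seen, stack = set(), list(children.get(term, ()))
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(children.get(t, ()))
    return seen


def ancestors(term, children):
    """All ancestors of `term` (simple but quadratic; fine for a toy graph)."""
    return {p for p in children if term in descendants(p, children)}


def select_terms(genes_by_term, children, n):
    """Step A: terms annotating at least n but fewer than 2n genes,
    whose every descendant annotates fewer than n genes."""
    return {
        t for t, genes in genes_by_term.items()
        if n <= len(genes) < 2 * n
        and all(len(genes_by_term.get(d, ())) < n
                for d in descendants(t, children))
    }


def hypergeom_sf(k, total, a, b):
    """P(overlap >= k) for two gene sets of sizes a and b among `total` genes."""
    hits = sum(comb(a, i) * comb(total - a, b - i)
               for i in range(k, min(a, b) + 1))
    return hits / comb(total, b)


def multifunctional_genes(genes_by_term, children, n, root, total, alpha=0.05):
    """Step B: genes co-annotated by a pair of selected terms that passes
    both filters. `alpha` is an assumed cutoff, not taken from the caption."""
    selected = select_terms(genes_by_term, children, n)
    result = set()
    for s, t in combinations(sorted(selected), 2):
        shared_anc = (ancestors(s, children) & ancestors(t, children)) - {root}
        shared_desc = descendants(s, children) & descendants(t, children)
        common = genes_by_term[s] & genes_by_term[t]
        p = hypergeom_sf(len(common), total,
                         len(genes_by_term[s]), len(genes_by_term[t]))
        if not shared_anc and not shared_desc and p >= alpha:
            result |= common
    return result


# Hypothetical toy data: gene sets are assumed to be propagated to ancestors.
all_genes = {f"g{i}" for i in range(1, 21)}
children = {"root": ["A", "E", "H"], "E": ["F"]}
genes_by_term = {
    "root": all_genes,
    "A": {"g1", "g2", "g3", "g4"},
    "E": {"g4", "g5", "g6", "g7"},
    "F": {"g4", "g5", "g6"},
    "H": {"g8", "g9", "g10", "g11", "g12", "g13", "g14"},
}

selected = select_terms(genes_by_term, children, n=3)
multifunctional = multifunctional_genes(
    genes_by_term, children, n=3, root="root", total=len(all_genes))
```

In this toy graph, A and F are selected, E is rejected because its descendant F reaches the threshold, and H is rejected for annotating too many genes, mirroring the caption's example. To cover several specificity levels, as the caption describes, one would call `multifunctional_genes` for each threshold N of interest and take the union of the results.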

doi: https://doi.org/10.1371/journal.pcbi.1004467.g001