Integrative Data Mining Highlights Candidate Genes for Monogenic Myopathies

Figure 1

Integrated data mining workflow.

A signature of a disease group, composed of weighted terms, is generated from statistical analyses of genes already implicated in diseases of the group. Terms come from three main annotation groups: GO (Gene Ontology), PO (Phenotype Ontology, an aggregate of Human Phenotype Ontology and Mammalian Phenotype Ontology) and IA (Interactions Annotation). They are mined using Manteia and receive weights proportional to their enrichment in the set of genes implicated in the disease group, compared with the set of all genes in the human genome. Weights are attributed so that the three annotation groups contribute equally to the signature. The signature is then used to mine the genome for additional genes. Every gene in the genome receives a score equal to the sum of the weights of the signature terms that also annotate that gene, for a maximum possible score of 3000. Further filtering steps mark genes that have low relative skeletal muscle expression or are annotated with known diseases.
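The weighting and scoring steps described in the caption can be illustrated with a minimal sketch. It is not the Manteia implementation, and several details are assumptions: enrichment is taken as a simple frequency ratio (the caption does not name the statistic), only terms with a ratio above 1 are kept, and each annotation group is given a budget of 1000 points, inferred from three equally contributing groups and a maximum score of 3000. The names `build_signature`, `score_gene`, and `GROUP_BUDGET` are hypothetical.

```python
from collections import defaultdict

# Assumption: three equally weighted groups and a 3000-point maximum
# imply 1000 points per annotation group (GO, PO, IA).
GROUP_BUDGET = 1000.0


def build_signature(annotations, disease_genes, term_group):
    """Return {term: weight} for terms enriched among the disease genes.

    annotations   -- dict mapping gene -> set of annotation terms
    disease_genes -- iterable of genes already implicated in the disease group
    term_group    -- dict mapping term -> "GO", "PO" or "IA"
    """
    all_genes = set(annotations)
    disease = set(disease_genes) & all_genes

    # Assumed enrichment measure: term frequency among disease genes
    # divided by term frequency among all genes.
    raw = {}
    for term in term_group:
        in_disease = sum(1 for g in disease if term in annotations[g])
        in_genome = sum(1 for g in all_genes if term in annotations[g])
        if in_disease == 0 or in_genome == 0:
            continue
        ratio = (in_disease / len(disease)) / (in_genome / len(all_genes))
        if ratio > 1.0:  # assumption: keep over-represented terms only
            raw[term] = ratio

    # Rescale so the weights within each annotation group sum to GROUP_BUDGET.
    totals = defaultdict(float)
    for term, ratio in raw.items():
        totals[term_group[term]] += ratio
    return {
        term: GROUP_BUDGET * ratio / totals[term_group[term]]
        for term, ratio in raw.items()
    }


def score_gene(gene_terms, signature):
    """Sum the weights of the signature terms that also annotate the gene."""
    return sum(signature.get(term, 0.0) for term in gene_terms)
```

Under these assumptions, a gene annotated with every signature term scores 3000, and a gene sharing no term with the signature scores 0. The filtering on skeletal muscle expression and known disease annotation would be applied afterwards and is not shown here.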

doi: https://doi.org/10.1371/journal.pone.0110888.g001