Ontologies in Metabolomics

Researchers are also using ontologies as curated knowledge to provide guidance for their studies. In studying LSD1’s contribution to carcinogenesis through chromatin regulation, the authors of [19] used categories of biological processes specified in GO to perform signaling pathway analysis. Terms from the PO have also been used for pathway analysis, in the study of heterosis in Arabidopsis thaliana [20].

In the future, as adoption of ontologies grows, the advantages to the research community will continue to grow as well. For example, with OBI, research studies may be modeled directly on the ontology (see Example Use 3 in [11]). As this becomes prevalent, we will be able to organize and find studies with specific sets of properties very quickly. This will allow us to easily identify studies that use the best practices and techniques, separating them from lesser work.
The robust ecosystem of ontologies in biomedicine can already help researchers conduct their individual research. When combined with other technologies, it may also serve as the basis of vast knowledge bases derived from the work of the entire community. Using techniques based on natural language processing (e.g., [21]), one can extract the content of articles and build ontology-based indexes. These indexes might reveal new research questions. For example, should studies on several subtypes of a metabolite reveal a certain property, then it may be the case that the parent metabolite has this property as well. Reasoning systems such as those based on description logics or other subsumption or hybrid reasoners (e.g., [22,23]) may be able to identify these new research questions automatically. As

Introduction
Metabolomics studies the structure, function, and relationships of biological and chemical entities. As we move toward systems biology, we need to be certain that we are representing this knowledge consistently between studies and between laboratories. Applying ontologies to metabolomics can improve the consistency of study data. It can also link data through relationships that extend what can be computed from the study data, and enrich that knowledge source with a myriad of nationally available data to help fuel hypothesis-driven, laboratory-based research.
Ontologies have been successfully used to map databases to each other so that they may be used more effectively. There are several databases used in metabolomics research, including general databases such as KEGG LIGAND [1,2], which contains information about chemical compounds, reactions, and enzymes relevant to life, and MetaCyc [3], a database of metabolic pathways; and organism-specific databases such as those for E. coli: the E. coli metabolome database (ECMDB, [4]) and EcoCyc [5]. Some of these databases (namely, MetaCyc and EcoCyc) also have ontological components.
Ontologies also bring formal structure to domain terminologies, adding confidence in their utility. At the core of this standardization is a hierarchically organized controlled vocabulary. The vocabulary is derived from the prevailing term usage within the domain, and is generally mediated among, and agreed upon by, the field's practitioners. The organization into a hierarchy allows for subtype-supertype reasoning (i.e., subsumption), which is important for recognizing trends within classes of entities. Ontologies also contain other relations among terms, representing scientifically vetted domain knowledge that may be used to understand what typifies a term and differentiates it from others. This standardization eliminates duplication of knowledge (as is often found in resources such as PubChem [6]) and, if experiments are modeled on ontologies, leads to reproducible research.
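Subsumption can be illustrated with a minimal sketch. The hierarchy below is a hand-written toy, and the term names, the `IS_A` table, and the helper functions are all assumptions made for illustration; they are not drawn from ChEBI or from any of the tools cited here. The sketch only shows how an is_a hierarchy lets observations on specific terms be grouped under a more general class.

```python
# Toy is_a hierarchy: child term -> list of direct parent terms.
# Illustrative placeholders only, not real ontology identifiers.
IS_A = {
    "D-glucose": ["glucose"],
    "L-glucose": ["glucose"],
    "glucose": ["hexose"],
    "fructose": ["hexose"],
    "hexose": ["monosaccharide"],
    "monosaccharide": ["carbohydrate"],
}


def ancestors(term, is_a=IS_A):
    """Return every term that subsumes `term` (its transitive parents)."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


def is_subsumed_by(term, candidate, is_a=IS_A):
    """True if `candidate` is `term` itself or one of its ancestors."""
    return term == candidate or candidate in ancestors(term, is_a)


def group_by_ancestor(observed, ancestor, is_a=IS_A):
    """Select the observed terms that fall under a given class, sorted."""
    return sorted(t for t in observed if is_subsumed_by(t, ancestor, is_a))
```

For instance, `group_by_ancestor(["D-glucose", "fructose", "water"], "hexose")` keeps the two sugars and drops the unrelated term, which is the kind of class-level grouping that makes trends within a class visible.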
Ontologies have had an extremely positive effect on several subfields within biology and biomedicine, partly because many of the most common ontologies evolve in tandem as part of the OBO Foundry [7]. The ontologies most relevant to the field of metabolomics are all from the OBO Foundry: the Chemical Entities of Biological Interest ontology (ChEBI) [8], the Gene Ontology (GO) [9], the Plant Ontology (PO) [10], and the Ontology for Biomedical Investigations (OBI) [11].
Within metabolomics, these ontologies are being used with greater frequency and effectiveness. Perhaps the most obvious place to apply ontologies is in our tools and in the data we use with them. The BiNChE web tool uses ChEBI for enrichment analysis of raw metabolomics data, allowing scientists to better understand and sort through those data [12,13]. Enrichment analysis is also done using the PO in studies of plant metabolomics [14]. OntoMaton [15] is a tool that allows users to mark up Google Spreadsheets documents with ontological data, and is used in the MetaboLights [16] database of metabolomics studies. The mzTab data format is used to communicate mass spectrometry data to a wider audience by being usable in common tools like Excel and R. It uses ontological terms from GO and other ontologies to annotate terms for easy computational interaction with other data [17].
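To make the idea of ontology-based enrichment analysis concrete, the following is a minimal sketch of one common formulation, an over-representation test based on the hypergeometric distribution. It is not a description of how BiNChE or any other cited tool is implemented. The function names, the `annotations` structure, and the toy data in the usage note are assumptions made for illustration, and the sketch assumes each metabolite's class set has already been expanded to include its ancestor classes.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) when drawing `draws` items without replacement from a
    population containing `successes` annotated items."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def enrich(annotations, study_set):
    """Return {ontology class: p-value} for over-representation in study_set.

    annotations: dict mapping metabolite -> set of ontology classes
                 (assumed to be closed under is_a already).
    study_set:   metabolites of interest; each must be a key of annotations.
    """
    population = len(annotations)
    draws = len(study_set)
    classes = set().union(*annotations.values())
    results = {}
    for cls in classes:
        # Metabolites in the whole background annotated with this class.
        successes = sum(1 for terms in annotations.values() if cls in terms)
        # Metabolites in the study set annotated with this class.
        k = sum(1 for m in study_set if cls in annotations[m])
        results[cls] = hypergeom_sf(k, population, successes, draws)
    return results
```

As a usage example, with a hypothetical background of six metabolites, three annotated as "hexose" and three as "amino acid", a study set consisting of exactly the three hexoses gives a p-value of 1/20 for "hexose" and 1 for "amino acid". In practice, p-values computed this way across many classes would also need correction for multiple testing, which this sketch omits.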