Introduction
There are currently a number of genomics data-warehouses powered by the InterMine1 platform. These include large curated services dedicated to the primary Model Organism Database (MOD) communities as part of the InterMOD project2, the collected data sets of research projects such as the modENCODE project3, and a range of other resources including metabolicMine4, TargetMine5, FlyTFMine6, and MitoMiner7. In addition to being accessible through web-interfaces, these resources also provide web-service access (to be described in a forthcoming paper).
The InterMine system provides users with a number of benefits. A typical InterMine instance, such as FlyMine8 or YeastMine9, contains feature annotations, protein data, publications, biochemical pathways, orthology, Gene Ontology (GO), array expression results, and other kinds of data, all integrated into a single knowledge graph. This means end users are able to ask questions across different data types. InterMine’s particular data integration strategy puts minimal limitations on the kinds of queries that can be performed: an arbitrary number of data-sets can be referred to in the same query (provided links exist between them) and a wide variety of logical constraints can be added. The InterMine platform thus provides a basis for very flexible, user-defined queries over linked data sets.
The BioJS10 project seeks to provide a suite of reusable JavaScript components that members of the bioinformatics community will find useful for producing analysis and visualisation tools. The InterMineTable BioJS component contributes towards this aim by adding a data query and exploration tool to the set of BioJS components which exposes the full flexibility and power of user defined queries over integrated linked data in a clear user interface.
Installation
This is a visual BioJS component, so its intended audience is web developers aiming to provide extended functionality to web-based resources for life scientists. It is expected to be deployed within modern browser environments with access to third-party resources. With this in mind, installation consists of including the dependencies for the InterMine table component on the page, usually in the head section of an HTML page (see Supplementary materials A). Once the InterMine tables library dependency is loaded, the InterMineTable BioJS component may be included (see Listing 1).
Listing 1. Loading the BioJS InterMine Table library
<script src="Biojs.InterMine.Table.js"></script>
This last resource contains the definition of the InterMine.Table BioJS component. As it is not available from a reliable third party source, it currently needs to be downloaded from the BioJS registry11, and hosted locally.
Usage
This component is used by instantiating the InterMine table component in a user-included JavaScript file, passing in the appropriate configuration for the desired source of data as well as the query over those data.
Once instantiated, the results of the query against the specified integrated data-warehouse are loaded into a component where they can be browsed and manipulated. This means that the two critical concepts for using this component are 1) the location of the data-store, defined as the uniform resource locator (URL) pointing at the root of a set of web-services, and 2) the query to be run on the data in the store, defined in a configuration object. For example, to load a table of data from FlyMine the user would want the URL to point to FlyMine’s webservices:
Listing 2. Specifying the Data-Store
var url = "http://www.flymine.org/query";
The query can be broadly defined as a list of fields, identified by paths from a root, constrained by a (possibly empty) set of filters. There are some refinements to this (such as sort-order, optional element definition, and constraint composition) for which more detailed documentation12 exists. The concept of a path is important to the idea of a graph of linked data, as it enables chains of relationships between entities to be followed with minimal syntactic overhead. For example, the chain of relationships "the names of the protein domains of the proteins encoded by the genes belonging to a biochemical pathway" can be referred to as Pathway.genes.proteins.proteinDomains.name.
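To make the structure of a path concrete, the following minimal sketch splits a path string into its root class, the relationships followed, and the final field. The helper name describePath is hypothetical and is not part of the InterMine or BioJS APIs; it only illustrates how the dotted notation can be read.

```javascript
// Illustrative helper only: not part of the InterMine or BioJS APIs.
function describePath(path) {
  var parts = path.split(".");
  if (parts.length < 2) {
    throw new Error("A path needs a root class and at least one field: " + path);
  }
  return {
    root: parts[0],                  // class the path starts from
    references: parts.slice(1, -1),  // relationships followed, in order
    field: parts[parts.length - 1]   // final field that is read
  };
}

var described = describePath("Pathway.genes.proteins.proteinDomains.name");
console.log(described.root);                     // Pathway
console.log(described.references.join(" -> "));  // genes -> proteins -> proteinDomains
console.log(described.field);                    // name
```

Reading the path from left to right in this way mirrors the chain of relationships described above: start at a pathway, follow its genes, their proteins, and their protein domains, then read each domain's name.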
A query is defined as a plain JavaScript object which can be simple, such as the following query, which requests the common name, scientific name and taxon ID for all organisms in the data-store:
Listing 3. A simple Query, see Figure 1
// Available organisms
var query = {
  name: "An optional name",
  from: "Organism",
  select: ["commonName", "name", "taxonId"]
};
or arbitrarily complex, such as the following query which combines information from multiple data sources (OMIM13, PANTHER14, Treefam15, KEGG16, Reactome17, FlyBase18) and across different organisms to find the Drosophila melanogaster genes in the pathways of genes which are orthologous to human genes implicated in Alzheimer’s disease:
Listing 4. A Complex Query, see Figure 2
var disease = {
  from: "Disease",
  select: [
    "genes.homologues.homologue.pathways.genes.*"
  ],
  where: {
    "name": "Alzheimer*",
    "genes.organism.name": "Homo sapiens",
    "genes.homologues.homologue.organism.name": "Drosophila melanogaster"
  }
};
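In both query listings the paths in select and where are written relative to the root class named in from. The sketch below expands such a query so that every path starts at the root, which can make a complex query easier to read and check. The helper qualifyQuery is hypothetical and written for illustration only; it assumes the relative-to-root reading shown in the listings and is not the normalisation performed by the InterMine libraries themselves.

```javascript
// Illustrative helper only: expands paths so that each starts at the root class.
function qualifyQuery(query) {
  var root = query.from;

  function qualify(path) {
    // Leave paths that already start at the root untouched.
    var alreadyQualified = path === root || path.indexOf(root + ".") === 0;
    return alreadyQualified ? path : root + "." + path;
  }

  var where = {};
  Object.keys(query.where || {}).forEach(function (path) {
    where[qualify(path)] = query.where[path];
  });

  return {
    from: root,
    select: (query.select || []).map(qualify),
    where: where
  };
}

var full = qualifyQuery({
  from: "Disease",
  select: ["genes.homologues.homologue.pathways.genes.*"],
  where: { "name": "Alzheimer*" }
});
console.log(full.select[0]);              // Disease.genes.homologues.homologue.pathways.genes.*
console.log(Object.keys(full.where)[0]);  // Disease.name
```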
An element also needs to be present on the page where the table should be loaded. This can be any element (although a DIV element is conventional), and should be uniquely identifiable (through its ID for instance).
Listing 5. Defining the Target Element
var target = "#table-container";
These values are then passed to the component constructor, which builds a new table in the page, and loads the relevant data from the configured service:
Listing 6. Instantiation
var table = new Biojs.InterMine.Table({
  target: target,
  url: url,
  query: query
});
Once instantiated, a table will be loaded into the page displaying rows of data as specified by the query (see Figure 1, Figure 2).
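Because the constructor depends on three values (the target, the service URL, and the query), it can be useful to check them before instantiation. The following is a minimal sketch of such a check. The function checkTableConfig is hypothetical and not part of the component; it assumes the target is given as a selector string, as in Listing 5, and that a query needs at least a from class and one select path, as in Listings 3 and 4.

```javascript
// Illustrative helper only: reports obvious omissions in a table configuration.
function checkTableConfig(config) {
  var problems = [];
  var query = config.query || {};

  if (typeof config.target !== "string" || config.target.length === 0) {
    problems.push("target: expected a selector string such as '#table-container'");
  }
  if (typeof config.url !== "string" || !/^https?:\/\//.test(config.url)) {
    problems.push("url: expected an http(s) URL for the root of the web-services");
  }
  if (typeof query.from !== "string" || query.from.length === 0) {
    problems.push("query.from: expected the name of the root class");
  }
  if (!Array.isArray(query.select) || query.select.length === 0) {
    problems.push("query.select: expected at least one path");
  }
  return problems;
}

var problems = checkTableConfig({
  target: "#table-container",
  url: "http://www.flymine.org/query",
  query: { from: "Organism", select: ["commonName", "name", "taxonId"] }
});
console.log(problems.length);  // 0
```

An empty result indicates only that the configuration has the expected shape; whether the paths exist in the data model of the chosen service can only be determined by the service itself.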
Interaction
The table provides a number of common dynamic features, such as re-sorting, pagination and column rearrangement, and also permits much deeper interaction than other comparable table libraries. The table allows the underlying query to be changed: the constraints of the underlying query can be edited (Figure 3); columns (including ones referring to data types not in the original query) can be added; existing columns can be removed; changes made can be undone; the data can be exported in a number of formats or sent to another application, such as a local Galaxy19–21 instance, or to a remote application such as GenomeSpace22 (Figure 4); the results can be saved as a reusable set (a list) within the originating service; individual items can also be previewed (Figure 5).
One particularly useful feature is the ability to view the contents of a single column, analysing it in aggregate and adding or editing filters. This facility presents summary charts for columns based on data type: binned histograms for numerical data (Figure 6), and column charts for categorical data (Figure 7, showing the user adding a filter by selecting items from the column).
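The distinction between the two kinds of column summary can be illustrated with a short sketch: numerical values are counted into equal-width bins, while categorical values are counted per distinct item. The function summariseColumn is hypothetical and is not how the component or the InterMine service computes its summaries; it is only an illustration of the idea, assuming a column is available as a plain array of values.

```javascript
// Illustrative helper only: summarises one column of values by data type.
function summariseColumn(values, binCount) {
  var numeric = values.every(function (v) { return typeof v === "number"; });

  if (!numeric) {
    // Categorical data: count the occurrences of each distinct item.
    var counts = {};
    values.forEach(function (v) { counts[v] = (counts[v] || 0) + 1; });
    return { kind: "categorical", counts: counts };
  }

  // Numerical data: count values into equal-width bins between min and max.
  var min = Math.min.apply(null, values);
  var max = Math.max.apply(null, values);
  var width = (max - min) / binCount || 1;  // avoid a zero width when min equals max
  var bins = [];
  for (var i = 0; i < binCount; i++) { bins.push(0); }
  values.forEach(function (v) {
    // The maximum value is placed in the last bin.
    var index = Math.min(Math.floor((v - min) / width), binCount - 1);
    bins[index] += 1;
  });
  return { kind: "numeric", min: min, max: max, bins: bins };
}

console.log(summariseColumn([1, 2, 3, 4, 10], 3).bins.join(","));  // 3,1,1
console.log(summariseColumn(["a", "b", "a"], 3).counts.a);         // 2
```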
Events
The standard mechanism for communication between components in JavaScript is event signalling. As per the BioJS specification, this component supports other objects registering event listeners so they may be notified when events of interest (such as user interactions) occur.
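The listener-registration pattern can be sketched in a few lines. The TinyEmitter object below is a hypothetical stand-in written for illustration, not the BioJS implementation; only the addListener method name and the "imo:click" event name are taken from this article, and the emit method is an assumption introduced to trigger events in the sketch.

```javascript
// Illustrative stand-in only: a minimal object that supports event listeners.
function TinyEmitter() {
  this.listeners = {};
}

// Register a callback to be notified when the named event occurs.
TinyEmitter.prototype.addListener = function (eventName, callback) {
  this.listeners[eventName] = this.listeners[eventName] || [];
  this.listeners[eventName].push(callback);
};

// Notify every callback registered for the named event, passing on any
// further arguments. (Hypothetical method, used here to simulate events.)
TinyEmitter.prototype.emit = function (eventName) {
  var args = Array.prototype.slice.call(arguments, 1);
  (this.listeners[eventName] || []).forEach(function (callback) {
    callback.apply(null, args);
  });
};

var emitter = new TinyEmitter();
var seen = [];
emitter.addListener("imo:click", function (type, id) {
  seen.push(type + " " + id);
});
emitter.emit("imo:click", "Protein", 1234);
console.log(seen[0]);  // Protein 1234
```

A stand-in of this kind also allows listener functions to be exercised outside a browser, before they are attached to a real table.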
Once loaded, the table may emit a number of different events (as listed in the API documentation23), and may be manipulated by calling methods on the instance, allowing the calling page to respond to user interactions. For example, if a developer wished to receive notifications when the user clicks on any of the cells in the table, they can register to listen for these events:
Listing 7. Adding an Event Listener
table.addListener("imo:click", function (type, id) {
  alert("User clicked on " + type + " " + id);
});
This integration means that the table need not be an isolated part of an application, but can be fully integrated with other components. For example, instead of just notifying the user with an alert, the information about this object could be displayed in another component. If the user clicked on a protein, this could be detected, and other suitable components could be instantiated to display protein-specific analysis (see Listing 8).
Listing 8. Integrating with Other Components - Example 1
table.addListener("imo:click", function (type, id) {
  if ("Protein" === type) {
    table.service.findById(type, id)
      .then(loadProteinStructureDisplayer);
  }

  function loadProteinStructureDisplayer(protein) {
    // ... load other BioJS component here.
  }
});
As well as emitting events in response to user interaction, the table component exposes an API to change the state of the table by changing the query it represents. This allows communication in the other direction. For example, if a linked component, such as a protein structure displayer, emits an event indicating that the user has selected a given set of protein domains, the table could be modified by adding a filter for these domains to the current query (see Listing 9):
Listing 9. Integrating with Other Components - Example 2
/* Assuming a "displayer" component that emits the "domains:selected" event */
displayer.addListener("domains:selected", function (domains) {
  var currentQuery = table.getQuery();
  currentQuery.addConstraint({
    "Gene.proteins.proteinDomains.identifier": domains
  });
});
In this way, the interoperability of these components makes them increasingly useful to developers as more of them are published and integrated into third-party applications.