Web Tool

BioJS InterMine List Analysis: A BioJS component for displaying graphical or statistical analysis of collections of items from InterMine endpoints

[version 1; peer review: 1 approved]
PUBLISHED 13 Feb 2014

This article is included in the BioJS collection.

Abstract

Summary: The InterMineTable component is a reusable JavaScript component as part of the BioJS project. It enables users to embed powerful table-based query facilities in their websites with access to genomic data-warehouses such as http://www.flymine.org, which allow users to perform flexible queries over a wide range of integrated data types.
Availability:  http://github.com/alexkalderimis/im-tables-biojs; http://github.com/biojs/biojs; http://dx.doi.org/10.5281/zenodo.8301.

Introduction

InterMine [1] is a platform for building data warehouses which includes specialisations for the life-sciences. As part of the InterMOD [2] project, a number of InterMine data-warehouses have been developed and released to the public containing high-quality integrated data curated by the major model organism database (MOD) organisations. In addition, the InterMine platform is widely used by other projects, such as the modENCODE project [3], as well as a range of other resources including metabolicMine [4], TargetMine [5], FlyTFMine [6], and MitoMiner [7]. This means that reliable integrated data sets exist for use by researchers working in a wide range of fields in the life-sciences, which can be accessed through a common interface.

One of the features of the InterMine system is the ability to store named sets of entities, called lists, and refer to them in queries and other analysis. This allows a user, for example, to save a list of genes and reuse this saved collection easily. The InterMine system also allows specialised analysis to be performed taking advantage of the integrated nature of the data warehouse system. For example the system can run queries that aggregate information about relationships between data types, and provide indications of levels of statistical significance for the results (enrichment queries).

Until recently, the output of these list analysis tools was only accessible through the web-application built into the InterMine system. Recent work on the InterMine web services has enabled this functionality to be externalised into the list-widgets [8] project: separate JavaScript-based components that can be used in third party websites. These developments have already been incorporated into the standard InterMine web-application configuration, meaning that users of the tools described here have access to the same query and display mechanisms in their own sites that are available through the standard InterMine web-application.

InterMine supports the aims of the BioJS [9] initiative to provide well-designed, robust website components to application developers in order to foster code reuse and minimise duplicated effort. We therefore contribute this set of components for running list analysis tools and displaying their output to the BioJS project, so that they may be widely distributed and interoperate with tools from other developers.

Installation

As a JavaScript web component, these tools are designed to run within the JavaScript virtual machines provided by modern browsers, and to render to HTML pages. Installation means indicating to the remote client (the user) which resources to load as dependencies, and where these are located. Typically this is done by adding references to these resources in the head section of a page through the use of script elements (see code listing 1). Recent practice suggests that loading these resources at the end of the body improves page load time. The dependencies that must be loaded to use these tools are listed in Supplementary materials A.

The BioJS InterMine list analysis library needs to be downloaded from the BioJS registry [10] and hosted in an accessible location.

Listing 1. Loading the list analysis tools library.

<script src="Biojs.InterMine.ListAnalysis.js"></script>

Usage

Once the BioJS component and its dependencies are loaded, the component itself may be instantiated. Instantiation creates a new list analysis displayer, inserts it into the document, and populates it with the appropriate data by calling the InterMine web-services. This requires that an element exists within the document (see code listing 2) into which the component can be inserted.

Listing 2. The target document element

<div id="list-analysis-example"></div>

The JavaScript code to instantiate the component refers to this element as the target, and provides the other arguments required to specify which list we wish to analyse, the URL of the service where that list is to be found, and which specific analysis tool we wish to run. The example below uses a list of genes encoding putative Drosophila melanogaster transcription factors made available as a public list at FlyMine [11] and runs the pathway enrichment statistical analysis tool. The full set of available lists (which each user can extend by creating personal lists) and analysis tools can be accessed from the InterMine service being used.

Relationship enrichment

One category of tools is the enrichment tools, which run queries that attempt to find relationships that are statistically significant for the set of entities as a whole. For example, FlyMine [11] contains both genes, loaded from sources such as FlyBase [12], and biochemical pathways, loaded from sources such as KEGG [13] and Reactome [14]. The pathways enrichment tool lists pathways of which genes in the list are members, ordered by the degree of significance for the list of genes as a whole.

For example, if one gene in a list is in a particular pathway, but none of the others are, it would be considered less significant than a pathway that all or most genes in a list belonged to. Similarly, the background probability that a particular relationship exists for an item is taken into account. For example, a publication that lists many or even all genes for an organism, such as Clark 2007 [15], would not be considered as significant as a publication that mentions fewer genes, but with most of them being in the list of interest.

The p-values used as measures of statistical significance are calculated by modelling the relationships as a hypergeometric distribution (as in Rivals 2007 [16] and Beissbarth 2004 [17]), which determines the probability that a relationship between two entities would be selected at random given the set of items to choose from. Let n be the number of items in the list, N be the size of the reference population, k be the number of items in the list which are involved in the given relationship (are mentioned in the publication, for example, or belong to a particular biochemical pathway), and M be the number of items in the reference population which share that same relationship. Then for each relationship:

$$P = \frac{\binom{M}{k}\binom{N-M}{n-k}}{\binom{N}{n}}$$
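As a rough illustration, the probability defined above can be evaluated numerically. The sketch below is not the InterMine implementation; the function names are hypothetical, and log-factorials are used only to avoid overflow for larger populations. The `upperTail` helper reflects a common convention in enrichment testing (summing the probability of k or more matches), which is an assumption here rather than something the text specifies. The sketch assumes n ≤ N and M ≤ N.

```javascript
// Natural log of n!, computed by direct summation (adequate for a sketch).
function logFactorial(n) {
  let sum = 0;
  for (let i = 2; i <= n; i++) sum += Math.log(i);
  return sum;
}

// Natural log of the binomial coefficient C(n, k); -Infinity if impossible.
function logChoose(n, k) {
  if (k < 0 || k > n) return -Infinity;
  return logFactorial(n) - logFactorial(k) - logFactorial(n - k);
}

// Probability of exactly k related items in a list of n items, when M of the
// N items in the reference population share the relationship.
function hypergeometricProbability(k, n, M, N) {
  return Math.exp(logChoose(M, k) + logChoose(N - M, n - k) - logChoose(N, n));
}

// Assumed convention (not stated in the text): probability of k or more matches.
function upperTail(k, n, M, N) {
  let total = 0;
  for (let i = k; i <= Math.min(n, M); i++) {
    total += hypergeometricProbability(i, n, M, N);
  }
  return total;
}

// Example: list of 3 items, 2 of them in a pathway that contains 4 of the
// 10 items in the reference population.
const exactlyTwo = hypergeometricProbability(2, 3, 4, 10); // 6 * 6 / 120 = 0.3
const twoOrMore = upperTail(2, 3, 4, 10);
```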

The options made available for multiple test correction include the Bonferroni, Holm-Bonferroni, and Benjamini-Hochberg [18] algorithms.
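To make the difference between these corrections concrete, the following sketch shows their textbook forms applied to an array of raw p-values. The corrections are computed by the InterMine service, so this is only an illustration with hypothetical function names, and the service may differ in details such as the number of tests assumed.

```javascript
// Indices of pValues ordered by ascending p-value.
function ascendingOrder(pValues) {
  return pValues.map((_, i) => i).sort((a, b) => pValues[a] - pValues[b]);
}

// Bonferroni: multiply every p-value by the number of tests, capped at 1.
function bonferroni(pValues) {
  const m = pValues.length;
  return pValues.map((p) => Math.min(1, p * m));
}

// Holm-Bonferroni: step-down multipliers m, m-1, ..., 1 over ascending
// p-values, with a running maximum to keep adjusted values monotone.
function holmBonferroni(pValues) {
  const m = pValues.length;
  const adjusted = new Array(m);
  let runningMax = 0;
  ascendingOrder(pValues).forEach((index, rank) => {
    const scaled = Math.min(1, (m - rank) * pValues[index]);
    runningMax = Math.max(runningMax, scaled);
    adjusted[index] = runningMax;
  });
  return adjusted;
}

// Benjamini-Hochberg: scale the i-th smallest p-value by m / i, with a
// running minimum taken from the largest p-value downwards.
function benjaminiHochberg(pValues) {
  const m = pValues.length;
  const order = ascendingOrder(pValues);
  const adjusted = new Array(m);
  let runningMin = 1;
  for (let rank = m - 1; rank >= 0; rank--) {
    const index = order[rank];
    runningMin = Math.min(runningMin, (pValues[index] * m) / (rank + 1));
    adjusted[index] = runningMin;
  }
  return adjusted;
}

const rawPValues = [0.01, 0.04, 0.03, 0.005];
const byBonferroni = bonferroni(rawPValues);
const byHolm = holmBonferroni(rawPValues);
const byBenjaminiHochberg = benjaminiHochberg(rawPValues);
```

Bonferroni is the most conservative of the three; Benjamini-Hochberg controls the false discovery rate rather than the family-wise error rate, so it generally leaves more results below a given threshold.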

The tools in this category are all prefixed with enrichment:, and can be loaded as follows:

Listing 3. Loading an enrichment list analysis tool.

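// Shorthand for the component constructor.
var ListAnalysis = Biojs.InterMine.ListAnalysis;
var analysis = new ListAnalysis({
  target: "list-analysis-example",       // id of the element from Listing 2
  url: "http://www.flymine.org/query",   // InterMine service hosting the list
  list: "PL FlyTF_putativeTFs",          // name of the list to analyse
  tool: "enrichment:pathway_enrichment"  // analysis tool to run
});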

Once run, the component should be inserted into the document (see Figure 1). The component allows the user to adjust the parameters of the analysis, including the multiple test correction method used, the p-value threshold and the background population.

Figure 1. A list analysis tool displaying the results of a statistical analysis query.

The component also allows the user to interact with the results in a number of ways, specifically: by clicking on an individual item that was matched; by clicking on a button to show a set of matches; and by clicking on a button to request that the selected items be saved to some location. All these actions cause the component to emit events, which can be listened for and handled by the host JavaScript application. For example, to alert a string such as Gene - FGBN0123 when a user clicks on the corresponding element, one might attach an event listener to capture the onClickMatch event (see code listing 4).

Listing 4. Listening for a click event.

analysis.onClickMatch(function (ident, type) {
  alert(type + " - " + ident);
});

This enables the behaviour of the component to be integrated into the hosting application. The full listing of events and their arguments is included in the BioJS API documentation [19].
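As one possible integration, a host application might record which matches a user has clicked rather than alerting them. The sketch below relies only on the onClickMatch listener shown in Listing 4; the helper name and the stub object are hypothetical, and the stub exists solely so the example can run outside a browser.

```javascript
// Records clicked identifiers, grouped by type, for later use by the host page.
// `analysis` is any object exposing onClickMatch(callback), as in Listing 4.
function collectClickedMatches(analysis) {
  const clicked = new Map(); // type -> Set of identifiers
  analysis.onClickMatch(function (ident, type) {
    if (!clicked.has(type)) clicked.set(type, new Set());
    clicked.get(type).add(ident);
  });
  return clicked;
}

// Hypothetical stand-in for the component; NOT part of the BioJS API.
function makeStubAnalysis() {
  const listeners = [];
  return {
    onClickMatch: function (callback) { listeners.push(callback); },
    simulateClick: function (ident, type) {
      listeners.forEach(function (callback) { callback(ident, type); });
    }
  };
}

const stub = makeStubAnalysis();
const clicked = collectClickedMatches(stub);
stub.simulateClick("FGBN0123", "Gene");
```

In a real page, the object returned by the ListAnalysis constructor would be passed in place of the stub.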

The canonical example for the use of statistical enrichment in bioinformatics is enrichment of Gene Ontology (GO) terms for sequence annotations (Rivals 2007 [16]). This functionality is supported as one of the statistical analysis tools (see Figure 2), within this more generic enrichment analysis framework. The GO enrichment tool merits some further notes, however, as it supports some of the more advanced parameters.

Figure 2. A list analysis tool displaying the results of the Gene Ontology (GO) statistical analysis query.

The GO enrichment tool demonstrates the use of optional filter parameters to limit the results in some way. In the GO tool, it allows the user to select the sub-ontology they are interested in. The user can also choose to normalise the results of this tool, in this case by transcript length.

Charts

The other main category of analysis tools is the chart tools. These run aggregate queries over the items in a list, and present the information graphically in interactive charts. The InterMine system supports both numerical and categorical charting, reflected in the supported chart formats: bar charts, line charts, pie charts and scatterplots.

Loading a chart analysis tool is identical to loading a statistical enrichment tool - only the name of the tool need differ (see code listing 5).

Listing 5. Loading a chart list analysis tool.

var chart = new Biojs.InterMine.ListAnalysis({
  target: "list-analysis-example", 
  url: "http://www.flymine.org/query",
  list: "PL FlyTF_putativeTFs", 
  tool: "chart:flyfish"
});

This code will request data for the particular tool (flyfish), as run against the given input list (PL FlyTF_putativeTFs), and then display the results in the appropriate chart format (Figure 3). The chart tools have fewer parameters; they may take a single parameter, as detailed in the tool description available from the relevant service (e.g. http://www.flymine.org/query/service/widgets).
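Since the two tool categories differ only in the tool name, a host page could build both configurations from one helper. This is a sketch under that assumption: the helper name is hypothetical, and only the option keys and the enrichment: and chart: prefixes are taken from the listings above.

```javascript
// Tool categories named in the text; the prefix and tool name are joined by a colon.
const TOOL_CATEGORIES = ["enrichment", "chart"];

// Returns a copy of the shared options with the prefixed tool name added.
function listAnalysisOptions(category, toolName, shared) {
  if (!TOOL_CATEGORIES.includes(category)) {
    throw new Error("Unknown tool category: " + category);
  }
  return Object.assign({}, shared, { tool: category + ":" + toolName });
}

const sharedOptions = {
  target: "list-analysis-example",
  url: "http://www.flymine.org/query",
  list: "PL FlyTF_putativeTFs"
};

const enrichmentOptions = listAnalysisOptions("enrichment", "pathway_enrichment", sharedOptions);
const chartOptions = listAnalysisOptions("chart", "flyfish", sharedOptions);

// In a browser with the library loaded, either object could then be passed on:
// new Biojs.InterMine.ListAnalysis(chartOptions);
```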

Figure 3. A list analysis component displaying the results of the chart:flyfish tool (loaded in code listing 5), which queries against Fly-FISH [20] data.

In most cases the chart tools do not provide mechanisms for the user to change the results displayed. They do, however, provide several mechanisms for the user to interact with those results. The user can click on the groupings or data-points represented on the chart (see Figure 4), which triggers the same events available to enrichment tools; these can be captured in the same way (see code listing 4).

Figure 4. The result of a user clicking on the "stage 6–7, expressed" bar of the chart.

Discussion

This tool addresses an important set of needs for bioinformatics developers: the ability to perform enrichment analysis, and the visualisation of typed relationships between entities. The InterMine platform and this BioJS component make performing these analyses and displaying the output straightforward. This allows developers to focus on integrating the functionality where it is needed, and users to focus on interpreting rather than retrieving the data. It is expected that wide availability of these tools will provide significant savings in time for typically stretched developers and researchers. By providing this functionality as a BioJS component, it is hoped that integration between different tools will result in the creation of applications that are able to integrate analysis and visualisation from different platforms.

Conclusions

It is hoped that this component will prove useful to those developing tools for researchers in the life-sciences. Significant work has gone into creating, curating and combining high quality data sets. The recent work in exposing these resources through web-services and producing reusable web-based components allows this investment to benefit not just visitors to sites based on InterMine applications, but any developer or user who aims to include this kind of statistical analysis and visualisation in their platform. By providing bioinformatics web-developers, and their users, with access to a broad range of data sources meeting the needs of many diverse research communities, we expect to help reduce the development burden on projects with limited resources, and help minimise redundancy of effort.

Software availability

Zenodo: BioJS InterMine List Analysis Widgets, doi: 10.5281/zenodo.8302 [21].

GitHub: BioJS, http://github.com/biojs/biojs.

how to cite this article
Kalderimis A, Stepan R, Sullivan J et al. BioJS InterMine List Analysis: A BioJS component for displaying graphical or statistical analysis of collections of items from InterMine endpoints [version 1; peer review: 1 approved] F1000Research 2014, 3:45 (https://doi.org/10.12688/f1000research.3-45.v1)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Reviewer Report 13 Oct 2014
Clemens Wrzodek, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Penzberg, Germany 
Approved
The Manuscript:
The article is very clearly written and formatted. It strongly focuses on the end-users that want to use the published library and describes how to include it and what possibilities it offers. The examples shown in the manuscript are ...
HOW TO CITE THIS REPORT
Wrzodek C. Reviewer Report For: BioJS InterMine List Analysis: A BioJS component for displaying graphical or statistical analysis of collections of items from InterMine endpoints [version 1; peer review: 1 approved]. F1000Research 2014, 3:45 (https://doi.org/10.5256/f1000research.3699.r5874)