Research Article
BIGO: A web application to analyse gene enrichment analysis results

https://doi.org/10.1016/j.compbiolchem.2018.06.006

Abstract

Background and objective

Gene enrichment tools enable the analysis of the relationships between genes and the biological annotations stored in biological databases. The results obtained by these tools are usually difficult to analyse. Therefore, researchers require new tools with user-friendly interfaces available on all types of devices, as well as new methods that make the results easier to analyse.

Methods

In this work, we present the BIGO web tool. BIGO is a user-friendly web tool for performing enrichment analyses of a collection of gene sets. On the basis of the enrichment analysis results obtained, BIGO combines and organizes the biological terms and graphically represents the relationships between gene sets to make the results easier to interpret.

Results

BIGO offers services that allow researchers to focus on a concrete subset of results by discarding overly general biological terms, or to obtain useful knowledge through visual analysis of the functional connections between the sets of genes being analysed.

Conclusions

BIGO is a web tool with a novel and modern design that provides the possibility to improve the analysis tasks applied to gene enrichment results.

Introduction

Enrichment analysis techniques enable the evaluation of the functional properties of a certain group of genes using previous biological knowledge. These techniques measure the relationships between the genes in a group and the annotated biological terms stored in a database, such as Gene Ontology (Ashburner et al., 2000). In the literature, statistical measures, including the p-value (Rempher and Urquico, 2007), are used to quantify the significance of these relationships by means of different statistical models (Maciejewski, 2014). Moreover, different correction techniques (Draghici, 2003) can be applied to the p-values to reduce the effects of randomness in their calculation.
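To make the idea of a p-value followed by a correction step concrete, the following is a minimal sketch. It assumes a one-sided hypergeometric test as the statistical model and the Benjamini-Hochberg procedure as the correction; both are common choices, but the text above does not state which model or correction any particular tool uses. The function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(population_size, annotated_in_population,
                                sample_size, annotated_in_sample):
    """One-sided p-value: probability of drawing at least
    `annotated_in_sample` annotated genes when `sample_size` genes are
    drawn without replacement from the population.
    Assumption: hypergeometric model, chosen here for illustration only."""
    total = comb(population_size, sample_size)
    upper = min(annotated_in_population, sample_size)
    tail = sum(
        comb(annotated_in_population, k)
        * comb(population_size - annotated_in_population, sample_size - k)
        for k in range(annotated_in_sample, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.
    Assumption: one possible multiple-testing correction among several."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Illustrative numbers only, not data from the paper.
    raw = [hypergeom_enrichment_pvalue(1000, 50, 20, k) for k in (1, 3, 6)]
    print(benjamini_hochberg(raw))
```

The sketch uses only the standard library so that it stays self-contained; a statistics package would normally be preferred for large gene populations.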

To enable researchers to perform such analyses, more than 60 gene enrichment analysis tools have been developed in the past few years (Dougu and Seon Young, 2008, Huang et al., 2009a, Khatri and Drăghici, 2005, Khatri et al., 2012). According to the different functionalities that these tools provide to users, the following services are among those that have been improved the most by researchers in recent works: user interface, visualization of enrichment results, comparison of results, gene ID conversion and filtering options. These services will be addressed in the following paragraphs.

Most of these tools are web applications; however, to a large extent, they lack attractive, user-friendly, structured and optimized user interfaces. In other cases, they do not follow responsive design approaches, so they are not supported on mobile devices. Nevertheless, a new generation of tools has emerged, including WebGIVI (Sun et al., 2017), GS2D (Fontaine and Andrade-Navarro, 2016) and UBiT2 (Fan et al., 2017). UBiT2 is a good example of an attractive, user-friendly and responsive web tool. Moreover, it has video tutorials, real demonstrations and a user manual. This tool provides visualization of the results as well as the analysis of RNA-Seq and qRT-PCR data.

The visualization of the analysis results is a relevant factor in knowledge extraction. Table 1 shows the different results viewing formats (i.e., as a table, in a directed acyclic graph (DAG), as a network, using scatter or chart plots) from an example set of enrichment tools (Enrichr (Kuleshov et al., 2016), g:Profiler (Reimand et al., 2016), UBiT2, WebGestalt (Wang et al., 2017) and agriGO (Tian et al., 2017)). Networks usually represent biological terms as an ontology, providing useful biological knowledge, whereas interactive DAGs only provide an animated graphical representation of the enrichment results.

The ability to compare multiple sets of results extracted from the same experiment could be useful to understand different analysis possibilities and to draw better conclusions. Up to now, gene enrichment tools, such as Enrichr, have allowed users to save and/or download their results but do not allow them to store and compare multiple sets of results from the same test in a single work session.

Gene ID conversion is another relevant factor to consider. Some tools, such as DAVID (Huang et al., 2009a, Huang et al., 2009b), GraphWeb (Reimand et al., 2008) and g:Profiler, allow the mapping of gene IDs to different formats.

Furthermore, due to the large amount of data generated during a gene enrichment analysis task, a filtering capacity is useful for summarizing the results. In general, two filtering options have been implemented in the developed tools. On the one hand, filtering is performed before conducting the gene enrichment analysis. For example, GOrilla (Eden et al., 2009) enables analyses based on rules specified by, for instance, a p-value threshold; WebGestalt uses a significance level, number of permutations and minimum and maximum number of genes for a category, among other filters; and agriGO applies filters based on statistical tests and multi-test adjustment methods as well as the significance level. On the other hand, some tools apply filters to the gene enrichment results: WebGIVI enables the deletion of biological terms and agriGO provides the ability to select concrete biological terms and to download results or generate graphs based on them.
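The kind of filtering applied to enrichment results can be sketched as follows. This is an illustrative example under stated assumptions, not the implementation of any of the tools named above: the `EnrichedTerm` record, its field names, the `filter_terms` function and the default thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnrichedTerm:
    """Hypothetical record for one enriched biological term."""
    term_id: str
    p_value: float
    annotated_genes: int  # genes annotated with this term in the database


def filter_terms(terms, max_p_value=0.05, min_genes=5, max_genes=500):
    """Keep terms that are significant and neither too specific nor too
    general. The default thresholds are arbitrary illustrative values."""
    return [
        t for t in terms
        if t.p_value <= max_p_value
        and min_genes <= t.annotated_genes <= max_genes
    ]


if __name__ == "__main__":
    # Placeholder identifiers, not real ontology terms.
    results = [
        EnrichedTerm("TERM:A", 0.001, 40),
        EnrichedTerm("TERM:B", 0.200, 40),    # not significant
        EnrichedTerm("TERM:C", 0.001, 5000),  # very general term
    ]
    for term in filter_terms(results):
        print(term.term_id, term.p_value)
```

The upper bound on the number of annotated genes is one simple way to discard very general terms; the lower bound removes terms supported by too few genes.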

Despite these efforts, as the groups of genes become larger and more complex and the database sources include more and more data, the number of results and the relationships between groups will increase (Merico et al., 2010a). Therefore, there is a need not only to reduce the number of final results or to highlight those results that are really interesting for a specific study (Ultsch and Lötsch, 2014, Merico et al., 2010b), but also to take advantage of the complex relationships between genes, either intra-group or inter-group. In this sense, this complexity could be turned into knowledge by applying data analysis techniques that express these relationships in a simpler and more comprehensible manner. For example, Kozielski and Gruca (2014) present an approach to identify clusters of genes represented both by expression values and GO annotations without conflict with either of the two representations, leading to further analysis and interesting conclusions. In addition, the tool Categorizer (Na et al., 2014), which uses semantic similarity measures for GO terms, can create groups of genes and identify the best-fitting biological category for each gene, enabling researchers to focus on processes of specific interest.

As described above, these features are important for gene enrichment tools. New proposals should therefore not only take them into account but also include new functionalities that significantly improve on current ones. To this end, this paper presents a novel gene enrichment analysis tool called BIGO. BIGO is a web-based tool for enrichment analysis that includes the main characteristics of this kind of tool, along with other strengths that represent an improvement over the state of the art. BIGO's features can be summarized as follows: (a) It is a modern web tool with a fully responsive interface designed to guide users through the analysis process in a friendly way. (b) It provides two ways of visualizing the results, tables and interactive networks, as explained in later sections. As shown in Table 1, only WebGestalt (Wang et al., 2017) includes the network visualization mode. (c) It allows gene ID conversion regardless of the format and can apply different filtering actions during the analysis process, as explained later. (d) It can store different ranking results from a single work session and allows users to compare them, a feature not previously seen in the literature. Beyond these features, the added value of BIGO lies in the new functionalities that it includes. First, to the best of our knowledge, BIGO is the first tool designed to be applied throughout an entire experiment, which usually involves several groups of genes; the other tools can analyse only one group of genes at a time. BIGO not only generates enrichment results for each of several groups of genes but can also help to extract useful knowledge from the combination of these results.
To do this, BIGO organizes the biological terms obtained from all the groups of genes as a ranking and provides different criteria to help researchers focus on a specific portion of the enrichment analysis results. In addition, BIGO can infer hidden relationships between the groups of genes under analysis and represent these relationships as an interactive network. With these new features, BIGO helps researchers to cope with larger and more complex data sources by highlighting the results that are really interesting and extracting inter-group relationships that can lead to new conclusions.
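The two ideas in this paragraph, ranking terms across several gene groups and linking groups through their results, can be illustrated with a small sketch. The ranking criterion used here (the number of groups in which a term is enriched) and the linking rule (two groups are connected when they share at least one enriched term) are assumptions made for illustration; the text above does not specify BIGO's actual criteria, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def rank_terms(enriched_by_group):
    """Rank terms by the number of gene groups in which they are enriched.
    Ties are broken alphabetically so the output is deterministic."""
    counts = Counter(
        term
        for terms in enriched_by_group.values()
        for term in set(terms)
    )
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def group_network(enriched_by_group):
    """Return edges between gene groups that share enriched terms.
    Each edge maps a pair of group names to the sorted shared terms."""
    edges = {}
    for a, b in combinations(sorted(enriched_by_group), 2):
        shared = set(enriched_by_group[a]) & set(enriched_by_group[b])
        if shared:
            edges[(a, b)] = sorted(shared)
    return edges


if __name__ == "__main__":
    # Placeholder group names and term identifiers.
    enriched = {
        "group1": {"TERM:A", "TERM:B"},
        "group2": {"TERM:B", "TERM:C"},
        "group3": {"TERM:D"},
    }
    print(rank_terms(enriched))
    print(group_network(enriched))
```

An edge list of this shape could be passed to a graph visualization library to draw an interactive network of gene groups.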

Section snippets

Computational methods

The main purpose of this work is the implementation of a gene enrichment web tool that satisfies two main objectives: first, to help researchers to validate groups of genes using enrichment analysis results based on previous biological knowledge; second, to improve the subsequent analysis of these results.

The BIGO web application has been iteratively designed to take advantage of the expertise of researchers from biological and computational areas who work with groups of genes generated by

System description

In this section, the software specifications of the BIGO web tool are described. In addition, to complement this section, an execution-time experiment is included in the first section of the Supplementary material.

BIGO's software architecture is divided into three layers, following the concept and objectives of the model-view-controller (MVC) design pattern (Kim and Choi, 2016), that is, separating the business logic and data from the user interface and the controller

Sample of program runs

The main goal of this section is to demonstrate the relevance of BIGO as an enrichment analysis tool. First, a concrete experiment is conducted with BIGO. Second, the same experiment is carried out using two additional enrichment tools in order to compare the results obtained.

Conclusions

Gene enrichment analysis tools are currently facing new challenges due to the increasing quantity of available data. Since the database sources include more and more data, it has become more difficult to conduct gene expression data analysis because the number of results obtained has increased and the relationships between genes and groups have become more complex. Moreover, PCs are not the only hardware devices that researchers use to perform gene analysis: the use of mobile phones or tablets

Conflict of interest statement

The authors have no conflicts of interest to declare.

References (39)

  • N. Dougu et al.

    Gene-set approach for expression pattern analysis

    Brief. Bioinform.

    (2008)
  • S. Draghici

    Data Analysis Tools for DNA Microarrays

    (2003)
  • E. Eden et al.

    GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists

    BMC Bioinform.

    (2009)
  • L. Elia et al.

    Role of the ABC transporter Ste6 in cell fusion during yeast conjugation

    J. Cell Biol.

    (1996)
  • J. Fontaine et al.

    Gene set to diseases (GS2D): disease enrichment analysis on human gene sets with literature data

    Genomics Comput. Biol.

    (2016)
  • M. Franz et al.

    Cytoscape.js: a graph theory library for visualisation and analysis

    Bioinformatics

    (2016)
  • F. Gómez-Vela et al.

    Gene network biological validity based on gene-gene interaction relevance

    Sci. World J.

    (2014)
  • J. Holmes

    Struts: The Complete Reference (Complete Reference Series)

    (2006)
  • D. Huang et al.

    Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists

    Nucleic Acids Res.

    (2009)
  • The BIGO tool is available at http://bigo.cica.es.

    Open-source code available at https://github.com/aureliolfdez/bigo.
