BD-Func: a streamlined algorithm for predicting activation and inhibition of pathways

BD-Func (BiDirectional FUNCtional enrichment) is an algorithm that calculates functional enrichment by comparing lists of pre-defined genes that are known to be activated versus inhibited in a pathway or by a regulatory molecule. This paper shows that BD-Func can correctly predict cell line alternations and patient characteristics with accuracy comparable to popular algorithms, with a significantly faster run-time. BD-Func can compare scores for individual samples across multiple groups as well as provide predictive statistics and receiver operating characteristic (ROC) plots to quantify the accuracy of the signature associated with a binary phenotypic variable. BD-Func facilitates collaboration and reproducibility by encouraging users to share novel molecular signatures in the BD-Func discussion group, which is where the novel progesterone receptor and LBH589 signatures from this paper can be found. The novel LBH589 signature presented in this paper also serves as a case study showing how a custom signature using cell line data can accurately predict activity in vivo. This software is available to download at https://sourceforge.net/projects/bdfunc/.


INTRODUCTION
Systems-level analysis of the combined expression pattern of multiple genes can be more informative than the expression pattern of an individual gene, and there are a number of tools to calculate functional enrichment of differentially expressed genes (Huang, Sherman & Lempicki, 2009;Naeem et al., 2012;Nam & Kim, 2008). However, many functional annotations merely list membership in a pathway or ontology without explicitly modelling genes that should show activation or inhibition. For example, consider the KEGG canonical Wnt signalling pathway ( Fig. 1) (Kanehisa & Goto, 2000). This gene list includes molecules that both activate and inhibit the pathway, resulting in different phenotypes (Dellinger et al., 2012;Logan & Nusse, 2004). However, many functional enrichment tools would expect all the members of the pathway to behave similarly (Fig. 1C), such that up-regulation of a mix of activators and inhibitors can receive a higher score than selective up-regulation of only activators within the pathway. For example, the most standard  KEGG (Kanehisa & Goto, 2000). Up-regulated genes are shown in red and down-regulated genes are shown in green. (A) Complete Activation: All genes activating the Wnt signalling pathway are up-regulated and all inhibitors are down-regulated. (B) Complete Inhibition: All genes activating the Wnt signalling pathway are down-regulated and all inhibitors are up-regulated. (C) Mixed Pattern: All genes in the figure are up-regulated. It is unclear what the downstream expression levels should be, but one may hypothesize a mixed result from Figs. 1A and 1B. However, most functional enrichment tools would predict this as the pattern with the strongest up-regulation. Underneath each diagram is the expected signal distribution that would be produced by BD-Func. method for functional enrichment is to calculate over-representation of one gene list in another gene list, possibly using a Fisher's exact test or hypergeometric test; in the example described above, this sort of statistical test would ask if a list of differentially expressed genes shows a higher number of Wnt signalling genes than expected by chance. This sort of test cannot differentiate the behavior of those genes unless more detailed gene lists are defined (such as Wnt-inhibitors versus Wnt-agonists). This is a basic problem that BD-Func (BiDirectional FUNCtional enrichment) is designed to overcome.
Most functional enrichment tools either require upstream filtering of gene lists (FuncAssociate (Berriz et al., 2009), GATHER (Chang & Nevins, 2006, DAVID (Huang, Sherman & Lempicki, 2008), Connectivity Map (Lamb et al., 2006, WebGestalt (Zhang, Kirov & Snoddy, 2005), GoMiner (Zeeberg et al., 2003), ErmineJ (Lee et al., 2005)) and/or a comparison of signal intensities between two groups (T-profiler (Boorsma et al., 2005), GSVA (Hanzelmann, Castelo & Guinney, 2013), PAGE (Kim & Volsky, 2005), GSEA (Subramanian et al., 2005), ErmineJ (Lee et al., 2005)). However, BD-Func compares the relative expression levels between activated and inhibited genes, and we show that BD-Func can successfully analyze either fold-change values between populations or raw intensity/expression values (for both microarray and RNA-Seq data). Signalling Pathway Impact Analysis (Draghici et al., 2007;Tarca et al., 2009) can be used to model activation and inhibition within a graph, but that algorithm primarily focuses on network topology (which is not always known); in contrast, BD-Func uses a simpler assumption of binning genes into two categories (activation or inhibition). Additionally, the ability to analyze absolute expression values in a single sample is a unique feature only present in a limited number of functional genomic tools. For example, ASSESS (Edelman et al., 2006) can theoretically predict functional enrichment in a single sample, but that score is not taken into context amongst other samples: in other words, BD-Func uniquely uses a single sample enrichment score as a classifier, which may be useful for personalized medicine research or for the development of novel molecular signatures where the user may need to quantify the utility of the signature as a classifier. Additionally, correlations between a single sample and various samples within a database (such as SPIED (Williams, 2012)) can provide information about a single sample, but this requires having a database of samples for comparison (so, this strategy will not work without the presence of an external database). Finally, BD-Func users are encouraged to share their lists of activated and inhibited genes in a simple file format. This allows easy application of models that may not be in an existing database for molecular signatures.
This study tests the accuracy of BD-Func on datasets that were used to define gene sets in MSigDB (Molecular Signatures DataBase, Subramanian et al., 2005), in comparison to GSEA (Gene Set Enrichment Analysis, Mootha et al., 2003;Subramanian et al., 2005) and IPA (Ingenuity Pathway Analysis, Ingenuity ® Systems, www.ingenuity.com). IPA was selected for comparison because the upstream regulator function utilizes a similar principle as BD-Func (activation and inhibition is predicted by comparing the proportion of activated or inhibited targets, based upon annotations in the proprietary IPA database). GSEA was selected for comparison for two reasons: (1) GSEA was specifically designed to analyze MSigDB signatures, thus serving as a good baseline for positive control datasets and (2) MSigDB contains some signatures for both up-and down-regulated genes, so it is useful to compare separate analysis of these signatures (using GSEA) versus a direct comparison of up-regulated to down-regulated gene expression (using BD-Func). Different models for TGFβ and progesterone receptor (PGR) activity are also tested for robustness by application to other datasets. Finally, the utility of BD-Func to study custom gene signatures is tested with a novel signature associated with progesterone receptor status in breast cancer patients as well as a novel LBH589 signature that was defined using previously published cell line data and is validated using novel in vivo data presented in this study.

BD-Func algorithm
The basic principle behind BD-Func is to treat activated and inhibited genes as replicate observations in two populations. BD-Func is agnostic towards the type of signal used for analysis. For example, this paper uses BD-Func to study both fold-change values between two populations as well as raw signal intensities for a single column of signal values. In the paper, the t-test statistic is used to compare the expression patterns of activated and inhibited genes. However, BD-Func also allows users to compare the activated and inhibited distributions using a Mann-Whitney U test or Kolmogorov-Smirnov (K-S) test. BD-Func users have the option to calculate False Discovery Rates (FDR) using the method of Benjamini & Hochberg (1995) or the Storey q-value (Storey & Tibshirani, 2003).
BD-Func comes with four enrichment files: c2, c5, and c6 from MSigDB (Liberzon et al., 2011) and a list of functions defined directly from the human Gene Ontology (GO) annotations (Ashburner et al., 2000). All of these lists were created by searching for functions with both "up" and "down" (or "positive regulation" and "negative regulation") gene lists. For the human GO file, a functional annotation needed to contain at least 10 positively regulated genes and 10 negatively regulated genes. We also encourage users to share their own custom models on the BD-Func discussion group: http://sourceforge.net/ p/bdfunc/discussion/.
There are three different input files that can be analyzed using BD-Func ( Fig. S1): 1-D Input File: If the user supplies expression values for a single column of data, a density plot is created for the signal for the activated and inhibited genes. In this case, the BD-Func algorithm works exactly as described above.
2-D Input File: If the input file contains multiple columns of data, BD-Func is first run separately for each sample (represented by a column in the data matrix), as described above. Next, box-plots are created for test-statistic scores for each group (labelled in the header of the input file; Fig. S2). Finally, an ANOVA p-value is provided to compare the test statistics between groups. Theoretically, test statistics could be used to make functional predictions for each sample in isolation. There are some examples in this paper where this strategy works OK. However, comparing test statistics across all samples within different groups is the only strategy that consistently works well for all the analysis presented in this paper.
2-D Input File for Classifier: If the input file contains multiple columns of data and two groups (called "positive" and "negative"), BD-Func will create a receiver operating characteristic (ROC) plot using the test-statistic as the classification score using the ROCR package (Sing et al., 2005). This is in addition to all the calculations and output files for a normal 2-D input file containing multiple columns for BD-Func to analyze (in this paper, this represents normalized signal intensity values).
All other clinical samples (Anders et al., 2008;Bild et al., 2006;Chin et al., 2006;Finak et al., 2008;Huang et al., 2003;Ivshina et al., 2006;Sotiriou et al., 2003;The Cancer Genome Atlas Network, 2012) were downloaded and analyzed for differential expression using BRAVO (http://bravo.coh.org/) (X Deng, C Warden, Z Liu, I Zhang, Y-C Yuan, unpublished data). The novel progesterone receptor gene signature presented in this paper was produced by identifying genes in the expO dataset (GEO Series GSE2109) with a |fold-change| > 2 and an False Discovery Rate (FDR) < 0.05 (where the FDR is calculated using the method of Benjamini & Hochberg (1995) to analyze the distribution of t-test p-values). This is how the samples in this particular paper were processed, but users are not required to use this particular set of tools for preparing BD-Func input files and/or creating gene lists for custom signatures.
Reads Per Kilobase per Million mapped reads (RPKM (Mortazavi et al., 2008)) values for RNA-Seq data was downloaded from the TCGA web portal (The Cancer Genome Atlas Network, 2012). RPKM values were transformed by addition of 0.1 (to avoid large fold-change values for low coverage reads) followed by a log 2 transformation (to normalize the signal distribution).

IPA comparison
Ingenuity Pathway Analysis (IPA; Ingenuity ® Systems, www.ingenuity.com) contains an "Upstream Regulator" module that compares the enrichment of activated and inhibited genes among up-and down-regulated genes. So, the underlying principle is similar to BD-Func except it utilizes Ingenuity's propriety database of regulatory interactions and uses a z-score to calculate statistical significance between activated and inhibited genes. Gene lists in IPA were filtered for those genes showing |fold-change| > 1.5 while the entire gene list is used to define background enrichment. For recovery of known perturbations of oncogenic regulators, the upstream regulator must be identified as "activated" or "inhibited" (|z-score| > 2), which are the default cut-offs.

LBH589 signature
Activated and inhibited genes were defined using overlapping gene lists from 3 cell line treatments that have been previously published (Kubo et al., 2013). That same study showed that LBH589 treatment significantly decreased tumor volume in exemestane (EXE) resistant MCF-7aro xenografts in mice. This study analyzes novel microarray data from EXE-resistant tumors treated with (EXE + LBH589) or without (EXE only) treatment of LBH589. All animal research procedures were approved by the City of Hope Institutional Animal Care and Use Committee. This novel microarray data is available in GEO series GSE47346.
In order to be included in the novel BD-Func signature, genes must show differential expression in all 3 cell lines. Genes were defined as differentially expressed if they showed a |fold-change| > 1.5 and p-value < 0.05 , and the BD-Func signature genes had to meet these conditions for each of the 3 LBH589 cell line treatments (with consistent direction of fold-change). P-values were calculated via 1-way ANOVA with appropriate linear contrast was used to compare data sets using Partek ® Genomics Suite TM (Partek, Inc., St. Louis, MO). Fold-change values were calculated based upon the least-squares mean, and data was normalized using robust multichip average (RMA) normalization (Irizarry et al., 2003).

BD-Func shows equal or greater performance to GSEA and IPA for functional enrichment
Given the relative ease by which samples can be classified as having positive or negative activity for an individual biomarker, the accuracy of BD-Func was first tested by applying several MSigDB oncogenic signatures (Liberzon et al., 2011) to the datasets from which the signatures were defined (Claus et al., 2008;Elkon et al., 2005;Furuta et al., 2006;Padua et al., 2008;Wei et al., 2006). BD-Func was able to detect the activation or inhibition of all of these oncogenic signatures (Table 1, Fig. 2). GSEA could detect all of the signatures except the Claus et al. (2008) progesterone receptor signature. Among these 5 test datasets, IPA could only detect the activity of 2 of these genes; however, this is not a completely fair comparison because we would expect some overfitting of the MSigDB signatures for the GSEA and BD-Func analysis. Nevertheless, the significance of this analysis is that BD-Func can accurately detect perturbation of all of these biomarkers on datasets where we know that these specific genes will be altered.
In order to test the performance of BD-Func, GSEA, and IPA on novel datasets (which were not used by MSigDB to define gene lists), we applied the 3 algorithms to four datasets If the median t-test statistic is greater than 2, the box is colored red. If the median t-test statistic is less than −2, the box is colored green.
with TGFβ treatments (Table 2). All 3 algorithms showed roughly equal performance for predicting TGFβ treatment in the appropriate samples. Overall, analyses of these nine datasets indicate that BD-Func can provide similar quality results as GSEA and IPA.

BD-Func can determine the accuracy of novel predictive models
Six breast cancer datasets were also used to test the robustness of the progesterone receptor (PGR) signature (Anders et al., 2008;Chin et al., 2006;Finak et al., 2008;Huang et al., 2003;The Cancer Genome Atlas Network, 2012). Unfortunately, neither BD-Func, GSEA, nor IPA could predict progesterone status in all seven patient populations (Table 3). To be fair, the original Claus et al. (2008) dataset was used to define progesterone receptor status in meningioma samples whereas the novel datasets tested were all breast cancer samples (where testing for over-expression of progesterone receptor is common (Bardou et al., 2003)). Nevertheless, BD-Func is designed to be able to utilize custom gene signatures with activated and inhibited, so we defined a novel progesterone receptor signature using the expO dataset (GEO Series GSE2109). This signature can identify progesterone Table 1 Recovery of known perturbations for selected MSigDB oncogenic genes. BD-Func "Fold-Change" corresponds to analysis of fold-change values calculated between the perturbed and unperturbed groups. BD-Func "Intensity" corresponds to calculation of activity vs. inhibition score for each sample in the dataset followed by a comparison in the distribution of test statistics for all of the samples.  receptor positive and negative patients for all 7 cohorts (1 meningioma and 6 breast cancer datasets), so it is robust enough for application to multiple cancer types. Another unique feature of BD-Func is the ability to use the activation versus inhibition test statistic as a classifier to define a predictive model. If a t-test statistic of 2 is used for the cut-off of distinguishing the positive and negative populations (roughly corresponding to a p-value < 0.05), it is clear that the MSigDB signature is extremely accurate at predicting PGR status in the original dataset but not in the breast cancer datasets (Table S1). Likewise, the TGFβ signature could differentiate between the treated and untreated groups if the test statistic of 2 was used as the threshold to distinguish the groups (Fig. S2). However, this threshold does not work well in all circumstances: unlike the analysis of fold-change values, the p-value (for any statistical method) is not always the ideal statistic for assessment of functional enrichment on intensity values. For example, the mTOR and BRCA1 signatures Table 3 Prediction of progesterone receptor status in patient samples. In this table, BD-Func is used to analyze fold-change values between PR+ and PR− patients. ( Fig. 2B) show appropriate changes in test statistics that clearly distinguish treated and untreated groups, but activation and inhibition can't be defined based upon a pre-defined cut-off for the test statistic value (e.g., 2 or −2). For this reason, we provide an ANOVA p-value to quantify the difference in test-statistic between groups, where the test statistic serves as a score for a second calculation of statistical significance. Additionally, we believe that predictive statistics are a useful method for accessing BD-Func scores for individual samples within large patient populations. In order to quantify the accuracy of the model without depending on a pre-defined cut-off, BD-Func produces receiver operating characteristic (ROC) plots for each cohort and the area under the curve (AUC) is calculated for each ROC plot (where a 100% accurate model would have an AUC = 1.00) (Fig. S3, Table S1). The superior performance of the novel PGR signature on the breast cancer cohorts becomes even clearer when these predictive statistics are compared for the two models (Fig. 3, Fig. S4, and Table S2). Importantly, the novel signature showed the same level of accuracy on the TCGA breast cancer dataset as the original expO dataset. This is significant because the TCGA dataset is over twice as large as the expO dataset, and the TCGA dataset utilizes RNA-Seq while the expO dataset utilizes microarrays to quantify gene expression. In other words, this shows that BD-Func is capable of producing very robust predictions that translate across different genomic technologies.

BD-Func
BD-Func accurately quantifies LBH589 activity using a novel signature LBH589 (panobinostat) is a histone deacetylase inhibitor that has been previously shown to suppress the proliferation of aromatase inhibitor resistant breast cancer cells, which was Although the MSigBD gene set shows extremely high accuracy for the original meningioma dataset, it shows essentially random predictive power for the 6 larger breast cancer datasets. On the other hand, a custom progesterone receptor signature defined using the expO dataset shows high accuracy for all 7 cohorts, and the high accuracy is maintained even in the largest cohort (TCGA). a conclusion supported in part by functional enrichment analysis of commonly affected genes from 3 cell line experiments (Kubo et al., 2013). Gene lists derived from these cell line experiments can be easily used to define a BD-Func signature, so we hypothesized that the results from this previous in vitro cell culture study could be used to predict LBH589 activity in vivo in an animal study. Specifically, we asked if the signature defined based upon LBH589 treatment in 3 cell lines (H295R, MCF-7her2, HeLa) could detect LBH589 activity in a mouse xenograft from a different cell line (MCF-7aro xenograft treated with EXE). Indeed, BD-Func correctly used the cell line LBH589 signature to identify common gene expression changes in the tumours treated with LBH589 and EXE compared to the mice that were only exposed to EXE treatment (Fig. 4).

DISCUSSION
Comparison of BD-Func to GSEA indicated that BD-Func can provide similar oncogenic signature predictions with much shorter run time (Table S3) and a more direct comparison of genes that are expected to be up-or down-regulated by the oncogenic regulators. One limitation to BD-Func is that it can only conduct functional enrichment for regulators with genes that are both up-and down-regulated, so there are many gene lists in MSigDB that cannot be analyzed using BD-Func (which instead should be analyzed using a tool such as GSEA). It is also worth noting that BD-Func can work with a wide range of sizes of gene lists for activated and inhibited genes (Table S4), but we would recommend using at least a few dozen genes when defining custom signatures.
Comparison of the BD-Func oncogenic signatures to the IPA upstream regulators also showed that both programs could provide similar performance, which is not surprising giving the design of that module in IPA. One benefit to utilizing IPA is that IPA has a curated database which lists with a wider variety of regulators than the MSigDB oncogenic signatures that can be analyzed in BD-Func. In contrast, one major benefit to using BD-Func is the greater theoretical range of applications. For example, BD-Func provides an enrichment file for Gene Ontology (GO) categories (Ashburner et al., 2000), but IPA does utilize this same strategy of analyzing functional ontologies by comparing the expression of positively and negatively regulated genes.
The Connectivity Map is a commonly used tool to study gene expression profiles for drugs and other chemical perturbations (Lamb et al., 2006). There are no LBH589/panobinostat treatments in the Connectivity Map database (although this database can certainly provide other useful information), so BD-Func provides a unique opportunity to test for gene signatures that show a strong positive or negative correlation with novel drug treatments (such as LBH589). Additionally, BD-Func is compatible with any gene mapping (in this case, gene symbol), whereas the Connectivity Map requires users to define their signatures in terms of HG-U133A probes. For example, affected gene symbols had to be converted to HG-U133A probes for this analysis. We believe that being able to define signatures based upon gene symbol is a substantial practical benefit.
BD-Func also calculates a test statistic to represent functional activation or inhibition for each individual sample in a dataset, and this study shows how this statistic can be directly used as a classifier that can be used to quantify the predictive power of a given functional model. More specifically, this study shows the utility of using BD-Func for applying two novel predictive models (for progesterone receptor status in patients and for LBH589 drug treatment). The LBH589 signature provided biological confirmation that the results from an in vitro model can indeed apply to validation experiments in vivo. This is important because our hope is that the streamlined analysis, simple input file design, and BD-Func discussion board can be used to help scientists quickly and easily share novel predictive models. In short, BD-Func provides a novel framework for functional enrichment (by comparing the relative expression of activated versus inhibited genes) that is freely available with a user interface that is accessible to biologists without any coding experience. The results of this paper show that BD-Func provides accurate predictions matched by other popular tools, which make it a useful complement to standard analysis using tools like GSEA or IPA.