ApoptoProteomics, an Integrated Database for Analysis of Proteomics Data Obtained from Apoptotic Cells*

Apoptosis is the most commonly described form of programmed cell death, and its dysfunction is implicated in a large number of human diseases. Many quantitative proteome analyses of apoptosis have been performed to gain insight into proteins involved in the process. This resulted in large and complex data sets that are difficult to evaluate. Therefore, we developed the ApoptoProteomics database for storage, browsing, and analysis of the outcome of large scale proteome analyses of apoptosis derived from human, mouse, and rat. The proteomics data of 52 publications were integrated and unified with protein annotations from UniProt-KB, the caspase substrate database homepage (CASBAH), and gene ontology. Currently, more than 2300 records of more than 1500 unique proteins are included, covering a large proportion of the core signaling pathways of apoptosis. Analysis of the data set revealed a high level of agreement between the directionality of changes reported in proteomics studies and the expected apoptosis-related function and may disclose proteins without a currently recognized involvement in apoptosis based on gene ontology. Comparison between induction of apoptosis by the intrinsic and the extrinsic apoptotic signaling pathway revealed slight differences. Furthermore, proteomics has significantly contributed to the field of apoptosis in identifying hundreds of caspase substrates. The database is available at http://apoptoproteomics.uio.no.

Molecular & Cellular Proteomics 11: 10.1074/mcp.M111.010447, 1-15, 2012.
Apoptosis is the major form of programmed cell death and is essential for tissue homeostasis in organisms. It plays important roles in growth and development (1) and in the immune system (2), and it is associated with several diseases (3-6). The balance of death and survival signals is coordinated to ensure quality control and viability in the organism. A surplus of death signals is associated with neurodegenerative diseases including Huntington disease, Alzheimer disease, Parkinson disease, and amyotrophic lateral sclerosis (7) because the death signals may not only lead to complete cell death but may also render cells dysfunctional. On the other hand, a surplus of survival signals is predominantly associated with cancer. Cancer cells have the ability to evade apoptotic signals and promote survival beyond their normal lifespan. This is a hallmark of tumorigenesis, and chemotherapy has been extensively used to induce cell death in tumor cells. Several chemical compounds exist to induce apoptosis in different cancer cells and via different mechanisms, including taxol, cisplatin, and etoposide. The ultimate goal of conventional chemotherapy is to design and utilize chemicals to induce severe cellular damage in rapidly dividing cancer cells and to trigger apoptosis with as few side effects as possible (8). In addition to chemotherapy, radiotherapy is commonly used for this purpose. Ionizing radiation can cause direct or indirect (via radiolysis) DNA damage to induce apoptosis, whereas UV light produces pyrimidine dimers, which cause a bend in the DNA helix, rendering DNA unreadable to the polymerase.
Proteomics has emerged as an important tool in the study of apoptotic cells and how different therapeutics affect the status of a cell (9). Typically, untreated cancer cell lines are compared with treated cells in a quantitative experiment, and the differences in protein abundances in these two proteomes are interpreted to gain insight into the action of the drug. Originally, many quantitative proteomics experiments were performed based on comparing spot intensities between two different two-dimensional gels. This was further improved within one gel by applying the two-dimensional difference gel electrophoresis (DIGE) technology (10). There has also been a significant improvement of mass spectrometers during the last decade with increasing sensitivity, accuracy, and speed. Consequently, MS-based quantification techniques have been applied to proteomics data sets, including stable isotope labeling of amino acids in cell culture (SILAC) (11), tandem mass tagging (TMT) (12), isobaric tags for relative and absolute quantification (iTRAQ) (13), and isobaric peptide termini labeling (IPTL) (14). SILAC enables peptides derived from different physiological conditions to be quantified at the MS level, whereas TMT, iTRAQ, and IPTL yield isobaric peptides and thus require MS/MS data to reveal quantitative information for the peptides. Furthermore, the efficiency of protein and peptide separation techniques prior to mass spectrometry has been improved. Subcellular fractionation allows for separating different cellular compartments and thus enables spatial proteomics, whereas enhanced on-line and off-line chromatography enables in-depth analysis of the proteomes. Finally, an end user interested in single protein regulation events will have to compare identification and quantification data derived from a variety of techniques, instruments, cells, and apoptosis inducers.
Many different quantitative proteome analyses have been reported using different apoptosis inducers and proteomic approaches. To consolidate these data, we present the ApoptoProteomics database (APdb), which is a manually curated and integrated database, carefully gathered based on proteomic studies of apoptosis. We have systematically collected information regarding different MS instrumentation, peptide and protein separation techniques, quantification techniques, and more and integrated this with available gene ontology, pathway, and subcellular location protein annotations from UniProt-KB. The database is available on the web to allow the search and comparison of proteomic identifications and quantification information across experiments and to find previously reported data on proteins of interest.

EXPERIMENTAL PROCEDURES
Manual Curation-Publications describing large scale proteomic studies of apoptosis were extracted from PubMed. Publications were considered if at least 10 proteins were reported to be changed and if this information was available within the manuscript or as supplementary material. 52 publications fulfilled these criteria and were used as the basis for APdb (supplemental Table S1). The following information was manually retrieved from the publications: protein name, accession number, reported change, reported direction, PubMed identification number, first author, journal, year of publication, apoptosis inducer used, type of study, cell types used, organism, subcellular fractionation, protein and peptide separation techniques, quantification techniques, and type of mass spectrometer used. The reported regulation was always kept in the form apoptosis/nonapoptosis so that the directionality of regulation is consistent across studies.
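The apoptosis/nonapoptosis convention can be illustrated with a minimal Python sketch. The function names and the `numerator` labels are hypothetical and not taken from APdb; the sketch only assumes that a study reports a positive ratio and states which condition is in the numerator.

```python
def normalize_ratio(ratio: float, numerator: str) -> float:
    """Express a reported ratio as apoptosis/nonapoptosis.

    `numerator` names the condition in the numerator of the published
    ratio: "apoptosis" or "control" (hypothetical labels).
    """
    if ratio <= 0:
        raise ValueError("a fold-change ratio must be positive")
    if numerator == "apoptosis":
        return ratio
    if numerator == "control":
        return 1.0 / ratio  # invert a control/apoptosis ratio
    raise ValueError(f"unknown numerator: {numerator!r}")


def direction(ratio: float) -> str:
    """Directionality implied by an apoptosis/nonapoptosis ratio."""
    if ratio > 1:
        return "up"
    if ratio < 1:
        return "down"
    return "unchanged"


# A study reporting control/apoptosis = 0.5 describes a protein that is
# twice as abundant in apoptotic cells, i.e. up-regulated.
print(direction(normalize_ratio(0.5, "control")))
```

Storing one orientation only means that "up" and "down" can be compared across studies without consulting each publication's own convention.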
Database Integration and Implementation-All of the apoptosis inducers were classified according to their function. All of the protein identifiers used in the publications were mapped and updated to the UniProt standards using the ID mapping function within UniProt-KB or protein-protein Blast (Blastp). In addition, data residing in the CAspase Substrate dataBAse Homepage (CASBAH) (15) were integrated in APdb together with the following information from UniProt-KB version 2011_02: UniProt identification number, protein short name, keywords, pathway, and subcellular location using an in-house Python script. Gene ontology (GO) version 1.2047 and GO Annotations version 1.181 (human), 2011_06 (mouse and rat) were downloaded from the GO Consortium and EBI, respectively. An in-house Python script incorporating code from the GoaTools package was generated to calculate a directed acyclic graph of the GO terms and extract every protein's association with apoptosis, including recursively the basic parent-child relationship (is_a), the part-of relationship (part_of), and regulation relationships (positively_regulates and negatively_regulates). All of this metadata extraction and integration was done for every protein found in the 52 publications prior to deployment to facilitate fast response to searches on the web server. The database was stored as a flat tab-separated file to support the model-view-controller design pattern. The APdb will be updated twice per year to incorporate new apoptosis-proteomics-related publications and changes to UniProt-KB, CASBAH, GO, and GO Annotations.
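The recursive extraction of apoptosis associations from the GO graph can be sketched as follows. This is not the in-house script and does not use GoaTools; it is a minimal illustration on a hand-written toy fragment of the ontology. The function names, the output labels, and the decision to also follow the generic `regulates` relationship are assumptions made for the example.

```python
# Toy fragment of the GO graph: child term -> [(relationship, parent term)].
# A real analysis would load the full ontology file instead.
EDGES = {
    "GO:0042981": [("regulates", "GO:0006915")],
    "GO:0043065": [("is_a", "GO:0042981"),
                   ("positively_regulates", "GO:0006915")],
    "GO:0043066": [("is_a", "GO:0042981"),
                   ("negatively_regulates", "GO:0006915")],
    "GO:0006096": [],  # glycolysis; no path to apoptosis in this toy graph
}
APOPTOSIS = "GO:0006915"
# "regulates" is followed in addition to the relationships named in the text.
FOLLOWED = {"is_a", "part_of", "regulates",
            "positively_regulates", "negatively_regulates"}


def relations_to_apoptosis(term, edges=EDGES, target=APOPTOSIS,
                           _seen=frozenset()):
    """Return the relationship types found on paths from `term` up to
    `target`, or None if `target` cannot be reached."""
    if term == target:
        return set()
    if term in _seen:  # guard against cycles in malformed input
        return None
    found = None
    for relation, parent in edges.get(term, []):
        if relation not in FOLLOWED:
            continue
        upstream = relations_to_apoptosis(parent, edges, target,
                                          _seen | {term})
        if upstream is not None:
            found = (found or set()) | upstream | {relation}
    return found


def classify(go_terms):
    """Label a protein by the GO terms it is annotated with."""
    relations, associated = set(), False
    for term in go_terms:
        rels = relations_to_apoptosis(term)
        if rels is not None:
            associated = True
            relations |= rels
    if not associated:
        return "no association"
    promotes = "positively_regulates" in relations
    inhibits = "negatively_regulates" in relations
    if promotes and inhibits:
        return "promoter and inhibitor"
    if promotes:
        return "promoter"
    if inhibits:
        return "inhibitor"
    return "associated"


print(classify(["GO:0043066"]))                 # inhibitor
print(classify(["GO:0043065", "GO:0043066"]))   # promoter and inhibitor
print(classify(["GO:0006096"]))                 # no association
```

Running this once for every protein before deployment, as described above, turns each web query into a simple lookup instead of a graph traversal.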
Web Implementation-The web interface was written in C++ using Wt libraries and the model-view-controller design pattern, and it supports access to the database through searching, filtering, comparing, browsing, and exporting. Searching and filtering can be limited to specific database fields such as protein name, journal, reported regulation, GO, etc. In addition, advanced filtering can be applied to extract the proteins where all studies report the same directionality of regulation. Advanced filters also exist to extract proteins where the reported regulation agrees with or disagrees with GO metadata. The database is available at http://apoptoproteomics.uio.no.
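The filter that extracts proteins for which all studies report the same directionality can be sketched in Python. The actual interface is implemented in C++ with Wt; the function name, the `min_reports` parameter, and the records below (placeholder accessions, made-up directions) are assumptions for illustration only.

```python
from collections import defaultdict


def consistent_direction(records, min_reports=2):
    """Map accession -> direction for proteins that have at least
    `min_reports` records, all reporting the same direction."""
    by_protein = defaultdict(list)
    for accession, direction in records:
        by_protein[accession].append(direction)
    return {
        accession: directions[0]
        for accession, directions in by_protein.items()
        if len(directions) >= min_reports and len(set(directions)) == 1
    }


# Made-up records with placeholder accessions.
records = [
    ("PROT_A", "up"), ("PROT_A", "up"),
    ("PROT_B", "up"), ("PROT_B", "down"),
    ("PROT_C", "down"),
]
print(consistent_direction(records))  # {'PROT_A': 'up'}
```

With `min_reports=2`, a protein reported only once is excluded because a single record cannot show agreement between studies.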
Bioinformatics-Over-representation analysis was performed using DAVID Bioinformatics Resources version 6.7 (16,17) utilizing either GO terms describing biological processes (GOTERM_BP_FAT) or the pathway databases BBID, Biocarta, EC_number, Kyoto Encyclopedia of Genes and Genomes (KEGG), Panther, and Reactome for pathway analysis. Functional annotation in DAVID was done using the whole human genome as background, and clustering was performed using the medium stringent clustering option. In addition, Cytoscape (18) version 2.7.0 was used with the plug-in BiNGO (19) version 2.42 for visualization of GO over-representation analysis and the plug-in iRefScape (20) version 0.94 for protein-protein interaction (PPI) analysis. BiNGO analysis was performed using ontology version 1.1787 (20365 terms for biological_process) and annotation version 1.181 for Homo sapiens (229244 annotations) downloaded from the GO Consortium. Excluded from BiNGO analysis were GO associations not reviewed by a curator (inferred from electronic annotation), purely sequence-based mapping (inferred from sequence or structural similarity), and cases where no evidence was found (no biological data available). Caspase substrate analysis was performed utilizing the CASBAH (15) and the Server for SVM Prediction of Caspase Substrates Cleavage Sites (CASVM) (21), respectively. The software GraBCas (22) was applied to predict additional caspase substrate cleavage sites for the effector caspase-3, -6, and -7.
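The idea behind over-representation analysis can be illustrated with a plain hypergeometric test. The text does not state which statistical test and multiple-testing correction were selected in BiNGO or DAVID, so the sketch below is a generic stand-in, not a reproduction of either tool; the function name and all numbers are made up.

```python
from math import comb


def hypergeom_tail(k: int, population: int, annotated: int, sample: int) -> float:
    """P(X >= k) when `sample` proteins are drawn without replacement from
    `population` proteins, of which `annotated` carry the term."""
    total = comb(population, sample)
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Made-up numbers: 10 background proteins, 3 annotated with a term,
# 4 proteins in the list of interest, at least 1 of them annotated.
p_value = hypergeom_tail(1, 10, 3, 4)
print(round(p_value, 4))
```

A small p value indicates that the term occurs in the protein list more often than expected from the background. In a real analysis the p values of all tested terms would additionally be corrected for multiple testing.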

Description of the ApoptoProteomics Database-The APdb is a manually curated and integrated database to consolidate published data from large scale proteomics studies of apoptosis, and it enables easy access to previously reported information regarding proteins of interest. The database can be accessed through a search function that supports both text matching and regular expressions, as well as filtering to specific parts of the database (Fig. 1). In addition, a browse function has been developed where the user can click on different parts of the apoptotic signaling pathway and be guided to related published proteomics results (Fig. 2). A selected protein from different studies can be compared considering, for example, apoptosis inducers, reported protein regulation, and proteomics workflow. APdb also supports exporting of search results as tab-separated text files to facilitate further analysis. We have taken the data as described in the publications, and APdb currently contains information gathered from 52 studies resulting in 2384 records of 1502 unique proteins. Notably, 204 of these proteins have an association with apoptosis utilizing GO. The information in APdb covers three mammalian organisms (human, mouse, and rat), 32 different cell types, and 43 different apoptosis inducers. Furthermore, two-dimensional electrophoresis/MS was used in 36 of the studies, GeLC/MS 13 times, and in-solution shotgun LC-MS four times. For gel-based quantification, silver and Coomassie Blue dye staining has been used 28 times, and DIGE has been used seven times. MS-based quantification has been performed five times using SILAC and once each using ICPL, 2-MEGA, and spectral counting. Furthermore, 11 of the studies analyzed subcellular purifications.
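Searching with regular expressions, restricting the search to one field, and exporting hits as tab-separated text can be sketched in a few lines of Python. This is only an illustration of the workflow, not the C++ implementation behind the web interface; the field names, the case-insensitive matching, and the rows (placeholder values) are assumptions.

```python
import csv
import io
import re


def search(rows, pattern, field=None):
    """Return rows where the regular expression `pattern` matches the
    given field, or any field when `field` is None."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for row in rows:
        values = [row[field]] if field else row.values()
        if any(regex.search(value) for value in values):
            hits.append(row)
    return hits


def to_tsv(rows, fieldnames):
    """Serialize rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder rows; the values are not taken from APdb.
FIELDS = ["protein", "inducer", "direction"]
rows = [
    {"protein": "Protein alpha", "inducer": "inducer X", "direction": "up"},
    {"protein": "Protein beta", "inducer": "inducer Y", "direction": "down"},
]
hits = search(rows, r"^protein a", field="protein")
print(to_tsv(hits, FIELDS), end="")
```

The exported text can be opened directly in a spreadsheet or read back with any tab-separated parser for further analysis.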
Analysis of the ApoptoProteomics Database-We first counted the frequency of all proteins in APdb and found that 403 of the 1502 unique proteins were reported more than once. Human vimentin (P08670) was the most frequently reported protein, listed 17 times in APdb (Table I). The ten most frequently observed proteins comprised four cytoskeletal proteins (vimentin, lamin B1, actin cytoplasmic 1, and myosin-9), four heterogeneous nuclear ribonucleoproteins (hnRNPs), rho-GDI 2, and 60 S acidic ribosomal protein P0 (Table I). A common feature of these proteins is that they are all known caspase substrates.
We extracted the proteins that were reported more than once and for which the different studies agreed on the direction of change (supplemental Table S2). This procedure yielded a list of 98 proteins, 66 up-regulated and 32 down-regulated, across the species human, mouse, and rat. 19 of them showed involvement in apoptosis utilizing GO metadata, whereas 79 proteins had no known association with apoptosis.
GO metadata was utilized to categorize all of the proteins into apoptosis promoters and inhibitors. Of the 1502 unique proteins, 204 showed an association with apoptosis, of which 128 were proteins with regulating functions in apoptosis. Of these, 61 were negative regulators, 55 were positive regulators, and 12 were characterized as both. 90 of these were reported having a directionality change during apoptosis (supplemental Table S3). Apoptosis promoters were reported up-regulated in 36 (86%) and down-regulated in 20 (48%) of the 42 cases. Inhibitors were reported down-regulated in 36 (62%) and up-regulated in 41 (71%) of the 58 cases. Note that 30 proteins are reported in APdb both up- and down-regulated, and 10 proteins are known to be both promoter and inhibitor according to GO apoptosis regulation. In total, 76% showed agreement between experimental directionality change and GO metadata.
Global Proteome Analysis-To obtain an overview of protein classes over-represented in APdb, we performed an enrichment analysis of all proteins in APdb with respect to their GO terms describing biological processes using BiNGO (Fig. 3). BiNGO is a Cytoscape plug-in able to determine significantly over-represented terms in a list of proteins and allows for visualization and control of GO evidence codes. Within the BiNGO output, clusters could be found for both apoptosis and regulation of apoptosis, as well as several cellular processes with clear association with apoptosis including "regulation of caspase activity", "regulation of cell cycle and arrest", "proteasomal protein degradation", and "nucleosome and chromatin assembly or disassembly".
To detect differences between up- and down-regulated proteins, the bioinformatics tool DAVID was applied to elucidate enrichment of biological annotation clusters of GO terms describing biological processes. DAVID systematically maps a number of proteins to their associated biological annotation and highlights the statistically over-represented terms. In addition, DAVID is able to cluster similar annotation terms into groups of terms rather than a linear list, which removes redundancy and eases the biological interpretation (16). These clusters obtain an enrichment score based on the p values of the individual terms in that cluster. Table II shows the 10 most significant clusters for both up- and down-regulated proteins. Several clusters are significantly enriched in both sets including "RNA processing", "nucleosome/chromatin assembly or disassembly", "protein complex assembly", "DNA repair", "apoptosis", "cytoskeleton organization", and "glycolysis". The main difference between the two sets was that for up-regulated proteins, one cluster described as "positive and negative regulation of apoptosis" was enriched, i.e. both apoptosis promoters and inhibitors, whereas the cluster described as "anti-apoptosis and negative regulation of apoptosis" was separated with a higher enrichment score from "positive regulation of apoptosis" for down-regulated proteins.
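How a cluster score can be derived from the p values of its member terms is sketched below. DAVID's documentation describes the group enrichment score as the geometric mean, on a -log scale, of the member terms' p values; the sketch assumes a base-10 logarithm, and the function name and p values are made up for illustration.

```python
import math


def enrichment_score(p_values):
    """Minus log10 of the geometric mean of the p values, which equals
    the arithmetic mean of the individual -log10(p) values."""
    if not p_values:
        raise ValueError("a cluster needs at least one term")
    return -sum(math.log10(p) for p in p_values) / len(p_values)


# Made-up cluster with two terms, p = 0.01 and p = 0.0001.
score = enrichment_score([1e-2, 1e-4])
print(round(score, 2))
```

Under this definition a higher score means that the terms in the cluster are, on average, more strongly over-represented; a score of about 1.3 corresponds to a geometric mean p value of 0.05.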
Pathway analysis was performed in DAVID using the databases BBID, Biocarta, EC_number, KEGG, Panther, and Reactome, for all of the human proteins in APdb (Table III). We analyzed whether the proteins in APdb could be mapped to apoptosis-related over-represented pathways and performed separate analyses for up- and down-regulated proteins. Indeed, the pathways "apoptosis", "FAS signaling pathway", "TNFR1 signaling pathway", and "caspase cascade in apoptosis" were found to be enriched. In addition to these four apoptosis-related pathways, 27 other pathways were also identified as over-represented, including some unexpected pathways such as the influenza infection pathway and pathogenic Escherichia coli infection pathway. These pathways are seemingly not connected to apoptosis, but the proteins assigned to "influenza infection" overlapped with 43 proteins of "ribosome" and 16 of "spliceosome".
PPI analysis was performed using Cytoscape with the iRefScape plug-in. iRefScape is a tool for protein interaction visualization that uses iRefIndex (20), a consolidated interaction database that incorporates interaction data from 10 interaction databases: BIND, BioGRID, CORUM, DIP, HPRD, IntAct, MINT, MPact, MPPI, and OPHID. The human proteins in APdb were separated into a set of proteins with no known binary interactions within APdb (187 proteins) and one large, highly interconnected cluster (supplemental Fig. S1). This cluster contained 767 proteins, and 3602 binary interactions were detected. The three proteins with the most interactions were 14-3-3 ζ/δ (151 interactions), 14-3-3 γ (71 interactions), and mitogen-activated protein kinase 13 (64 interactions). Filtering on only proteins having "apoptosis" as a GO term (red color in supplemental Fig. S1) revealed that the three proteins with most interactions were 14-3-3 ζ/δ (33 interactions), caspase-3 (23 interactions), and vimentin (18 interactions). Furthermore, the cluster "apoptosis" from Fig. 3 was used to visualize the PPIs of the apoptosis-associated proteins in APdb. This cluster contained 67 proteins, and 54 of these had known binary interactions internally within this cluster (Fig. 4). This interaction map clearly showed the extrinsic pathway all the way from the Fas receptor to caspase cleavage of cellular components. In addition, two other clusters were found that contained components of the proteasome and the nuclear pore complex.
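Counting binary interactions per protein and separating proteins without any interaction partner inside the set can be sketched as follows. The analysis above was done in Cytoscape with iRefScape; this Python sketch only illustrates the counting step. The function name, the exclusion of self-interactions, and the placeholder protein names are assumptions.

```python
from collections import Counter


def interaction_degrees(proteins, interactions):
    """Count distinct binary partners per protein, considering only
    interactions where both partners belong to `proteins`."""
    members = set(proteins)
    pairs = {
        frozenset((a, b))
        for a, b in interactions
        if a != b and a in members and b in members  # drop self-interactions
    }
    degree = Counter({protein: 0 for protein in members})
    for pair in pairs:
        for protein in pair:
            degree[protein] += 1
    return degree


# Placeholder proteins and interactions; "E" is outside the protein set.
proteins = ["A", "B", "C", "D"]
interactions = [("A", "B"), ("B", "A"), ("A", "C"), ("A", "A"), ("A", "E")]
degree = interaction_degrees(proteins, interactions)
isolated = sorted(p for p, d in degree.items() if d == 0)
print(degree.most_common(1))  # [('A', 2)]
print(isolated)               # ['D']
```

Storing each pair as a `frozenset` makes the count independent of the order in which the two partners are listed and removes duplicate reports of the same interaction.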
Comparison of Proteome Changes Caused by Induction of the Intrinsic and Extrinsic Pathway-We analyzed whether different classes of proteins were identified by proteome analyses induced either by the intrinsic or the extrinsic pathway. To assess this question, we selected the proteins in APdb induced by agents known to activate either the extrinsic pathway (292 proteins) or the intrinsic pathway (692 proteins) for separate BiNGO enrichment analyses using GO terms describing biological processes (supplemental Fig. S2). Most of the clusters from Fig. 3 were present in both pathways, such as "RNA processing", "apoptosis", "regulation of apoptosis", "regulation of protein ubiquitination", and "cell cycle". However, some minor differences were observed between the pathways: "NF-κB regulation" and "regulation of STAT phosphorylation" were only present in the extrinsic pathway, whereas "regulation of caspase activity", "developmental processes", "signaling through epidermal growth factor receptor, nerve growth factor receptor, and insulin receptor", "energy generation", and "gluconeogenesis and glycolysis" were found only in the intrinsic pathway. Stress responses were mainly overlapping between the pathways, but "response to reactive oxygen species" could only be found in the intrinsic pathway.
FIG. 3. GO term over-representation analysis of biological processes. Over-representation analysis of all proteins in APdb is shown, with respect to their GO terms describing biological processes, analyzed with the Cytoscape plug-in BiNGO to visualize the significance of over-representation using node color gradient from 5.00E-2 (yellow) to < 5.00E-7 (orange). Several clusters describing biological processes linked to apoptosis are over-represented.
Major Proteins Involved in Apoptosis Signaling-We analyzed to what extent the major components of the core apoptotic signaling pathway were identified by the proteomics studies (Table IV). Parts of the apoptosome, caspases, death receptors, DNA fragmentation factors, and mitochondria-related proteins are present as identified proteins in APdb, whereas less abundant proteins like the Bcl-2 family and inhibitors of apoptosis (IAPs) are present as studies, meaning that these proteins have been specifically analyzed using small interfering RNA knockdown, antisense, transgenic, or transfection experiments. Besides the core apoptotic signaling pathway, additional proteins involved in regulating apoptosis were identified including adaptor proteins, apoptosis inducers, apoptosis inhibitors, heat shock proteins, components of the proteasome, protein kinases, protein phosphatases, and 14-3-3 family members (supplemental Table S4). In the web implementation, most of the components of the core signaling pathway are available through the browse function, where the user can click different parts of the pathway and be guided to published proteomics results (Fig. 2).
Caspase Substrates-Five different proteomics approaches were published to specifically identify caspase substrates yielding a list of 523 unique proteins as potential substrates (23-27). We compared these proteins with the CASBAH (15) and the CASVM (21) and found 474 proteins to be overlapping (Fig. 5). The remaining 49 proteins, being unknown potential caspase substrates, were further analyzed for cleavage by the apoptotic effector caspase-3, -6, and -7 using the software GraBCas with high specificity (score cutoff = 20). 14 of these proteins were predicted to contain caspase cleavage sequence motifs with a high score (supplemental Table S5).
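The comparison against the two reference resources amounts to a set difference, which can be sketched in Python. The identifiers below are placeholders rather than real substrates, and the function name is hypothetical; only the counts in the final line are taken from the text.

```python
def novel_candidates(reported, *reference_sets):
    """Proteins in `reported` that appear in none of the reference sets."""
    known = set().union(*reference_sets)
    return set(reported) - known


# Placeholder identifiers standing in for protein accessions.
reported = {"S1", "S2", "S3", "S4"}
casbah_like = {"S1", "S2", "X9"}
casvm_like = {"S2", "S3"}

novel = novel_candidates(reported, casbah_like, casvm_like)
print(sorted(novel))  # ['S4']

# Counts from the text: 523 reported substrates, 474 already covered.
print(523 - 474)  # 49
```

Only the proteins returned by such a difference need to be passed on to a cleavage site predictor, which keeps the prediction step focused on candidates that are not yet documented.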

DISCUSSION
The ApoptoProteomics Database-We have created the APdb, an integrated searchable database that allows searching and browsing the apoptotic signaling pathway and guides to proteomics publications of proteins of interest. The database is freely available on the internet and downloadable as a text file for further investigation. APdb is protein centric and allows comparisons of proteins identified in different studies. As an example, searching APdb for the anti-apoptotic protein 14-3-3 ζ/δ yields two records of this protein reporting down-regulation in quantitative proteomics experiments (28,29), once utilizing SILAC and once two-dimensional DIGE for quantification of proteins. Currently, 52 publications contribute to the 1502 unique protein identifications.
Unique Proteins and Reported Quantitative Changes-Of the unique proteins in APdb, only approximately one-fourth was reported by more than one study. However, different experimental setups have been used that are likely to generate divergent results, e.g. apoptosis was induced by 43 different compounds in 32 different cell types. Another reason why this common set of proteins is as small as 403 proteins could be the large dynamic range and complexity in cellular proteomes. This typically leads to a systematic oversampling of abundant proteins while low abundant proteins remain undetected. Nevertheless, the set of common proteins has an over-represented population of proteins involved in the regulation of apoptosis, which is not the case for the list of proteins reported only once (DAVID over-representation analysis; data not shown). This population includes caspase-3, mitochondria-associated proteins (HtrA2, Diablo, and VDAC-1), apoptotic protease-activating factor-1, and proteins of the ubiquitin-proteasome pathway. We presume that this common basis will increase as new data, acquired with increasingly sensitive instrumentation and methodology, are added to APdb. Further analysis of the proteins reported by more than one study revealed 98 proteins where the different publications agree on the directionality of change, 80% of these with no recognized association to apoptosis. Several publications describing the same protein being regulated in a distinct direction during apoptosis suggest association to apoptosis, either with a regulatory role or immediately affected by the cellular state. As an example, serine/arginine-rich splicing factor 1 (Q07955, Q6PDM2), also known as SF2/ASF, is a splicing factor recently described as an oncoprotein and found overexpressed in primary non-small cell lung cancer tumors (30). SF2/ASF was found to specifically bind survivin mRNA and enhance its translation. 
Survivin, a member of the IAP protein family, is an anti-apoptotic protein frequently up-regulated in cancers, and in vitro down-regulation of SF2/ASF was found to induce apoptosis in non-small cell lung cancer cell lines, an effect associated with reduced expression of survivin. In APdb, SF2/ASF has been reported down-regulated in apoptosis four times, in three different cell types, suggesting that SF2/ASF could play a role in apoptosis by regulating survivin expression. Another example is RuvB-like 1 (Q9Y265), which was reported to be up-regulated four times, in three different cell types and with four different apoptosis inducers. RuvB-like 1 is an ATPase and a required constituent of the NuA4/TIP60 complex, a histone acetyltransferase with histone H4/H2A acetyltransferase activity (reviewed in Ref. 31). TIP60 also acetylates nonhistone proteins including androgen receptor and p53. Acetylation of lysine 120 of p53 supports the decision between cell cycle arrest and apoptosis and is required for binding of p53 on promoters of pro-apoptotic genes. An up-regulation of RuvB-like 1 then suggests an apoptosis-promoting function through the TIP60 complex and p53-mediated apoptosis. Moreover, chloride channels have been suggested to play an important role in apoptosis in signal transduction cascades, and chloride channel blockers have shown inhibition of apoptosis (reviewed in Ref. 32). Interestingly, APdb reports chloride intracellular channel protein 1 (O00299) as up-regulated four times. Presumably, several more of these 79 proteins showing consolidated protein regulation and having unrecognized association to apoptosis may indeed be involved in this process. Further analyzing APdb, we have found that several proteins were reported frequently in apoptosis proteomics studies. The most frequently occurring protein was vimentin, which is listed 19 times in APdb (17 times for human and twice for mouse). 
Vimentin was reported to undergo changes during apoptosis by 18 different studies and published for the first time in 1999, but according to its GO annotation history, it was not recognized in apoptosis until May 2010. The GO annotation of a protein will be under constant change as more information about protein function is unveiled through scientific experiments, and the results of annotation-based analysis must be interpreted in this context. We therefore assume that more proteins in APdb will over time be associated with apoptosis than the current 204 proteins. According to Byun et al. (33), vimentin is organized into an extensive network of cytoplasmic intermediate filaments that upon apoptosis is disassembled by a range of caspases. Vimentin is cleaved into several similarly sized fragments by multiple caspases, where proteolysis at Asp-85 by caspase-3 or caspase-7 generates a pro-apoptotic N-terminal fragment whose ability to induce apoptosis is dependent on caspases. In APdb, vimentin is reported as being both up- and down-regulated with a ratio ranging from 0.05 to 11.1. 
However, quantifying a caspase substrate represents a challenge because the different species of the protein show different regulation. Before cleavage, vimentin exists predominantly in its full-length version, and consequently the amount of the full-length version is reduced by cleavage, whereas the different fragments are increased in amount in the apoptotic state. We suspect that the diverging ratios are a result of which fragment of vimentin was identified in the different studies, an observation which might also be true for other proteins.
FIG. 4. Protein-protein interaction map of apoptosis-related proteins in the ApoptoProteomics database. Shown is a protein-protein interaction map of the proteins in APdb that clustered together in the "apoptosis" cluster from Fig. 3, analyzed with the Cytoscape plug-in iRefScape. Only proteins with binary interactions are shown. The extrinsic apoptotic pathway is highlighted (green lines) starting from FAS (TNR6_HUMAN, purple) to caspase-8, which further activates caspase-3, -6, and -7 (blue) and finally cleaves target caspase substrates (green). Blue, all caspases; cyan, apoptosome; red, proteasome; yellow, nuclear pore complex. Furthermore, all known caspase substrates from the CASBAH are colored green. The size of the protein name reflects the number of known binary interactions within this map.
Considering the proteins in APdb that have been reported by more than one study, 181 proteins showed disagreement on the directionality between studies. Although some of this disagreement can be the effect of proteolytic cleavage during the execution phase of apoptosis, some proteins may have temporal or spatial changes, or some may exhibit both pro-apoptotic and pro-survival functions in different cell types or in response to different stimuli. For example, the heat shock protein family consists of high and low molecular weight chaperones and mostly possesses pro-survival roles by inhibiting release of mitochondrial factors or inhibiting the activated caspases (34). Consistently, heat shock proteins are found to be frequently overexpressed in various cancers. However, 60-kDa heat shock protein (P10809), a mitochondrial matrix protein that facilitates folding of newly translated mitochondrial proteins, showed an additional, opposite role and was found to promote apoptosis by involvement in caspase-3 maturation and activation (34). Thus, this protein shows both pro-apoptotic and pro-survival functions and, intriguingly, is reported in APdb as being up-regulated five times and down-regulated seven times. This demonstrates that even a disagreement between studies on the directionality of protein changes can still reflect the correct biological processes.
Categorization of the proteins in APdb into apoptosis promoters and inhibitors was performed to study the agreement between experimentally derived directionality change and known function in apoptosis. Directionality agreement means that proteins involved in positive regulation of apoptosis were reported up-regulated, whereas proteins involved in negative regulation were reported down-regulated. In fact, 76% of the cases with experimental directionality showed agreement with GO metadata. For example, apoptotic protease-activating factor 1 (O14727) is a pro-apoptotic protein that, together with cytochrome c, generates a caspase-activating complex that recruits and activates caspase-9. This complex of apoptotic protease-activating factor-1 and activated caspase-9 is known as the apoptosome and has a crucial role in mediating cell death (35). Apoptotic protease-activating factor-1 has been reported twice in APdb as up-regulated, and given its pro-apoptotic role, an up-regulation in apoptosis is expected and thus constitutes an agreement between experimental data and GO metadata. Another example of agreement is the BAG family molecular chaperone regulator 3 (O95817), which binds to Bcl-2 and enhances its anti-apoptotic activity (36). This protein has been reported to be down-regulated during apoptosis. Furthermore, caspase-3 (P42574) is reported in APdb as both up- and down-regulated and exhibits both a pro-apoptotic function, as an effector caspase, and an anti-apoptotic function, by cleaving presenilin-2 (P49810) at Asp-329, which generates a negative feedback (37). Of interest are also the 21 proteins showing disagreement with GO. As we anticipated because of the delicate balance of signals initiated by drug-induced apoptosis, several apoptosis inhibitors are also up-regulated during apoptosis. In fact, 71% of the disagreeing proteins are negative regulators of apoptosis.

FIG. 5. Caspase substrate comparisons. A Venn diagram of caspase substrates reported in the APdb, the CASBAH, and the CASVM. Almost all of the reported caspase substrates in APdb and CASVM are already covered in CASBAH. Of the 49 proteins unique to APdb, 14 were predicted by the software GraBCas to harbor cleavage motifs for the effector caspases-3, -6, and -7 within their sequence.

TABLE IV The presence of components of the major pathways of apoptosis in APdb
See also supplemental Table S4 for apoptosis-associated proteins in APdb that are not in the core apoptotic signaling pathway.
A survey of the apoptosis signaling pathway as listed in KEGG (38) revealed death ligands and receptors, adaptor proteins, the Bcl-2 family, mitochondrial release factors, caspases, and caspase-activated DNases as the main components. In addition, APdb contains IAPs, inhibitor of caspase-activated DNases, kinases and phosphatases, and the p53 signaling pathway. Furthermore, the ubiquitin-proteasome pathway has been shown to regulate several members of the Bcl-2 family, the IAP family, and IκB (39), thus affecting apoptosis. Members of the 14-3-3 family have been reported to interact with the apoptotic machinery (40). A large repertoire of proteins involved in these processes was identified in proteomics studies, but less abundant proteins are yet to be identified. Large scale proteomics experiments detect predominantly abundant proteins, whereas detecting low-abundance components such as the Bcl-2 family most likely requires a targeted proteomics approach, e.g. utilizing a different instrument setup such as selected reaction monitoring in combination with subcellular fractionation protocols to enrich, for example, cytoplasmic and mitochondrial proteins.
Over-representation Analysis of Data in APdb-The BiNGO overview of over-represented GO terms showed several annotation clusters with a clear relation to apoptosis. In many experimental setups, apoptosis is induced using a drug that causes DNA damage (see clusters "DNA damage and integrity checkpoints" and "regulation of cell cycle and arrest"). The cells respond to this event by halting the cell cycle (see clusters "cell cycle" and "DNA replication") and trying to repair DNA. In case of damage that cannot be repaired, the cell enters apoptosis (see clusters "apoptosis" and "regulation of apoptosis"), and caspases are activated (see cluster "regulation of caspase activity"). During apoptosis, the cytoskeleton is reorganized and broken down (see clusters "cytoskeleton organization" and "regulation of cytoskeleton organization"), histones are released from DNA and chromatin is condensed (see cluster "nucleosome and chromatin assembly or disassembly"), DNA is fragmented (see cluster "nucleobase, nucleoside, and nucleotide metabolic processes"), protein synthesis is shut down (41) (see clusters "translation" and "regulation of translation"), and ribosomes are structurally altered (41) (see cluster "ribosome biogenesis"). Notably, the large scale analysis of caspase substrates by Mahrus et al. (27) revealed similar biological processes.
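To make the notion of an over-represented GO term concrete, the sketch below computes a one-sided hypergeometric p-value. This section does not state which statistical test or multiple-testing correction was configured, so the choice of test is an assumption (it is a common choice for GO over-representation), and the function name and the numbers in the example are purely illustrative.

```python
from math import comb

def enrichment_p_value(k, n, K, N):
    """One-sided hypergeometric tail P(X >= k).

    N: proteins in the reference set
    K: reference proteins annotated with the GO term
    n: proteins in the list under study
    k: proteins in that list annotated with the GO term
    """
    upper = min(n, K)
    total = comb(N, n)
    # comb() returns 0 when the draw is impossible, so no extra bounds check is needed.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Illustrative numbers only: 2 of 2 listed proteins carry a term
# that annotates 5 of 10 reference proteins.
p = enrichment_p_value(k=2, n=2, K=5, N=10)  # 10/45
```

In practice, one such test is run per GO term, so the resulting p-values need a multiple-testing correction before terms are called enriched.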
Some of the overlapping clusters in the separate up/down DAVID analysis are also found in the BiNGO overview, including "cytoskeleton organization," "RNA processing," and "nucleosome/chromosome assembly or disassembly." Actin microfilaments, intermediate filaments, and microtubules are the major constituents of the cytoskeleton in eukaryotic cells and are easy to detect using mass spectrometry because of their high abundance. Induction of apoptosis using the cytostatic drug taxol has been shown to induce reorganization of vimentin filaments (42), and reorganization of actin, vimentin, and tubulin leads to cell morphology changes seemingly connected with characteristic features of apoptosis such as membrane budding (43,44). Therefore, we were not surprised to find "cytoskeleton organization" as an enriched cluster in APdb. Furthermore, both BiNGO and DAVID analysis showed enrichment of proteins involved in "RNA processing" and "regulation of translation." Proteins in these clusters included predominantly hnRNPs, ribosomal proteins, and splicing factors. hnRNPs are RNA-binding proteins involved in transcription, splicing, stabilization, and translational regulation (45). In addition, they are among the most abundant proteins in the eukaryotic nucleus and have been shown to translocate from the nucleus to the cytosol and mitochondria during apoptosis (46). Indeed, a major rearrangement of ribonucleoprotein complexes occurs during apoptosis, including extrusion from the nucleus to the cytoplasm (47). In APdb, 128 records of hnRNPs are listed, identifying 29 unique proteins in this family (21 human, seven mouse, and one rat). According to the CASBAH, 16 of these hnRNPs are reported as caspase substrates. The reported changes in protein abundance in proteomic studies may reflect several cellular processes including cleavage by caspases, protein expression, and protein translocation.
Consequently, the occurrence of many hnRNPs in APdb, which leads to enrichment of these terms, is due to both caspase cleavage and translocation. The cluster "nucleosome and chromatin assembly or disassembly" comprised predominantly histones, components of chromatin and nucleosome remodeling complexes, and proteins involved in chromatin condensation. Nuclear DNA is wound around histone octamers into nucleosomes and further into the structure known as chromatin. Chromatin condensation and DNA fragmentation in apoptotic cells have been shown to be accompanied by destruction of nucleosomes, which releases histones from apoptotic chromatin (48). Comparing cellular extracts from apoptotic versus nonapoptotic cells will therefore likely show an increased abundance of histones in response to apoptosis. In summary, global analysis of proteomics publications regarding apoptosis identified the major processes occurring in the cells during apoptosis in addition to the major apoptosis-regulating proteins.
DAVID analysis showed only slight differences in enrichment between up- and down-regulated proteins, i.e. most of the clusters found are overlapping. However, for up-regulated proteins, GO terms describing both positive and negative regulation of apoptosis are clustered together as one cluster with enrichment score 4.5, whereas for down-regulated proteins, there seems to be a higher enrichment of apoptosis inhibitors, because anti-apoptosis and negative regulation were separated from positive regulation by a score difference of 4.1. Analysis of APdb showed that, when comparing experimental directional change with GO metadata, 76% of the cases agreed on the function, i.e. up-regulation suggested a pro-apoptotic function and down-regulation an anti-apoptotic function. Here, we observed that up-regulated proteins seemed to include both apoptosis promoters and inhibitors, whereas down-regulated proteins included more apoptosis inhibitors. This also corroborates the above observation that, of the 24% disagreeing cases, 71% were apoptosis inhibitors, i.e. proteins that were up-regulated although they were assigned a negative regulatory function.
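The agreement rule used throughout this analysis (promoters up, inhibitors down) can be written as a short classification. This is a hedged sketch, not the procedure actually used for APdb: the case list, the placeholder accessions "HYPO_1" and "HYPO_2", and the role labels are hypothetical. The last lines assume that the 71% figure refers to the 21 disagreeing proteins mentioned earlier, which would correspond to about 15 proteins.

```python
def agrees(direction, go_role):
    """Agreement as defined in the text: positive regulators reported up,
    negative regulators reported down."""
    return (direction, go_role) in {("up", "positive"), ("down", "negative")}

# Hypothetical cases: (accession, reported direction, GO-based role).
cases = [
    ("O14727", "up", "positive"),    # pro-apoptotic, up-regulated: agreement
    ("O95817", "down", "negative"),  # anti-apoptotic, down-regulated: agreement
    ("HYPO_1", "up", "negative"),    # inhibitor reported up-regulated: disagreement
    ("HYPO_2", "down", "positive"),  # promoter reported down-regulated: disagreement
]

agreeing = [c for c in cases if agrees(c[1], c[2])]
disagreeing = [c for c in cases if not agrees(c[1], c[2])]

agreement_rate = len(agreeing) / len(cases)
inhibitor_share = sum(1 for c in disagreeing if c[2] == "negative") / len(disagreeing)

# Assumption: 71% of the 21 disagreeing proteins are negative regulators.
implied_inhibitors = round(0.71 * 21)                # 15
implied_share_pct = round(100 * implied_inhibitors / 21)  # 71
```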
Pathway Analysis-Pathway analysis is often used to detect divergent pathways between experiments or different regulation of specific pathway components. We investigated whether the proteins in APdb could be mapped to well known apoptosis signaling pathways and whether the enriched pathways differed between up- and down-regulated proteins. 32 pathways were found to be enriched, and 12 were overlapping between the two sets. As discussed above, global proteomics analysis will not just identify apoptosis-specific processes, but rather all processes occurring in the cells during apoptosis. Consequently, it was not surprising to find over-represented pathways such as "ribosome," "gene expression," "spliceosome," and "cell cycle." The 32 over-represented pathways comprised 301 proteins in total, of which 63% were assigned to more than one pathway. The three proteins involved in most pathways were caspase-3, caspase-8, and ubiquitin (seven pathways each). This emphasizes that the proteins in APdb are multifunctional, i.e. they are involved in more than one cellular process. Nevertheless, some clear differences between up- and down-regulated proteins were found. Except for "apoptosis," the other three apoptosis-related pathways, "FAS signaling pathway," "TNFR1 signaling pathways," and "caspase cascade in apoptosis," were only enriched for the up-regulated proteins. These pathways are pro-apoptotic and thus support the above findings that pro-apoptotic proteins are predominantly up-regulated. In addition, we found that the signaling pathways for epidermal growth factor and fibroblast growth factor were only present in down-regulated proteins. Epidermal growth factor and fibroblast growth factor receptor signaling are usually associated with anti-apoptosis and may inhibit apoptosis by the activation of pro-survival signaling through phosphoinositide 3-kinase and the protein kinase Akt (49).
We have observed that anti-apoptotic proteins have been reported as both up- and down-regulated; however, this pathway analysis suggests that these anti-apoptotic pathways are still down-regulated during apoptosis.
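The statement that many proteins were assigned to more than one pathway amounts to counting pathway memberships per protein. The sketch below shows that counting on a toy example; the pathway-to-protein assignments and the gene symbols used as keys are hypothetical and are not the enrichment results reported here.

```python
from collections import Counter

# Hypothetical pathway membership (pathway name -> member proteins).
pathways = {
    "apoptosis": {"CASP3", "CASP8", "BID"},
    "FAS signaling pathway": {"CASP3", "CASP8", "FADD"},
    "ribosome": {"RPL5"},
}

# Number of pathways each protein is assigned to.
membership = Counter(p for members in pathways.values() for p in members)

# Proteins assigned to more than one pathway, and their share of all proteins.
multi_pathway = sorted(p for p, n in membership.items() if n > 1)
share_multi = len(multi_pathway) / len(membership)
```

The same counter, sorted by count, would give the proteins involved in the most pathways.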
Protein-Protein Interactions-Proteins rarely function as stand-alone units in the cell but often interact with other proteins. Functional interactions can consist of sharing common substrates in a pathway, regulating each other transcriptionally, or indirect binding through participation in larger complexes (50). Although detecting these interactions through PPI analysis has previously suffered from problems with identifier mapping (51), iRefIndex assigns Sequence Global Unique Identifiers (SEGUID) to every protein based on the protein sequence to ensure that proteins with the exact same sequence will be represented only once. Using iRefScape, the Cytoscape plug-in that utilizes the iRefIndex database, we first visualized the binary interactions within all human proteins in APdb. This interaction map showed that 187 proteins had no binary interactions, i.e. they are most likely interacting through neighbors, whereas the remaining 767 proteins were massively interconnected. It was not possible to visually detect any clusters within this network because of its size and interconnectivity. However, considering only apoptosis-related proteins using the GO cluster "apoptosis" from Fig. 3, we were able to trim down our network to only 54 proteins, and the extrinsic pathway and two clusters could be seen beside components of the proteasome and nuclear pore complex (Fig. 4). The proteasome is a multicatalytic proteinase complex that degrades proteins specifically targeted by ubiquitination (52). In apoptosis, it has been suggested that the ubiquitin-proteasome system may exert its role on the BH3-only proteins, because these proteins are upstream of mitochondrial outer membrane permeabilization, and that regulation by the ubiquitin-proteasome system indeed can determine whether a cell lives or dies (53).
Loss of any of the multiple core subunits of the 20 S proteasome has been shown to cause loss of cell viability (54), and inhibition of proteasome function during apoptosis has been reported through caspase-dependent proteolysis of proteasomal subunits (55,56). Interestingly, proteasomal subunits were reported in APdb to be up- or down-regulated or reported as caspase substrates during apoptosis. Nuclear transport provides an important level of control for regulating apoptosis (57). Inhibition of active nuclear import has been shown to prevent nuclear apoptosis, and several key proteins have been shown to translocate across the nuclear membrane: caspase-2, caspase-3, caspase-6, Fas-associated death domain, TNFR1-associated DEATH domain protein (TRADD), cytochrome c, AIF, and IAPs (reviewed in Ref. 57). The nuclear pore complex is embedded in the nuclear envelope, a dynamic structure that is degraded during apoptosis, leading to the typical nuclear pore complex clustering observed in apoptosis. The nuclear transport receptors importin α and β found in our cluster have been shown to be degraded in both caspase- and proteasome-dependent manners during apoptosis (58), a process that indeed can lead to up- or down-regulation in apoptosis as discussed above for vimentin.
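The two network observations used in this section, proteins without any binary interaction and label size scaled by the number of binary interactions, both reduce to a degree count on the interaction graph. The following is a minimal sketch under stated assumptions: it does not use iRefScape or iRefIndex, the edge list is illustrative (loosely following the extrinsic pathway described in the text), and "HYPOTHETICAL_1" is an invented protein without interactions.

```python
def degree_map(proteins, interactions):
    """Count distinct binary interaction partners per protein.

    Duplicate pairs, reversed pairs, and self-interactions are ignored.
    """
    unique_pairs = {frozenset(pair) for pair in interactions
                    if len(set(pair)) == 2}
    degree = {p: 0 for p in proteins}
    for pair in unique_pairs:
        a, b = tuple(pair)
        if a in degree and b in degree:
            degree[a] += 1
            degree[b] += 1
    return degree

# Illustrative input; not taken from the actual interaction map.
proteins = ["FAS", "FADD", "CASP8", "CASP3", "HYPOTHETICAL_1"]
interactions = [("FAS", "FADD"), ("FADD", "CASP8"),
                ("CASP8", "CASP3"), ("FADD", "FAS")]  # last pair is a duplicate

degree = degree_map(proteins, interactions)
isolated = sorted(p for p, d in degree.items() if d == 0)
by_degree = sorted(degree, key=lambda p: (-degree[p], p))
```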
Comparison of Proteome Changes Caused by Induction of the Intrinsic and Extrinsic Pathway-The extrinsic pathway is commonly described as activation of death receptors such as Fas (CD95/Apo-1) by its ligand Fas-L (CD95L/Apo-1L), which recruits the adapter protein Fas-associated death domain. Fas-associated death domain binds the intracellular death domain of the Fas receptor and procaspase-8 to form the death-inducing signaling complex. This leads to auto-cleavage and activation of caspase-8, an initiator caspase known to activate the effector caspases-3, -6, and -7 (59). In addition, caspase-8 has been shown to cleave the pro-apoptotic BH3-only protein Bid into its truncated version tBid, which is further N-terminally myristoylated and translocates to the mitochondrial membrane to induce mitochondrial outer membrane permeabilization (60). Cleavage of Bid by caspase-8 thus constitutes a cross-talk between the extrinsic and the intrinsic pathway in type II cells (61). Although this cross-talk exists, we observed some differences, such as regulation of caspase activity and response to reactive oxygen species being significantly enriched in the intrinsic pathway.
Caspase Substrates in APdb-Caspases are a family of proteins that are among the main executors of the apoptotic process. These cysteine proteases reside in the cells as inactive zymogens that upon cleavage are activated to exert their function. Caspases are categorized into initiator caspases and effector caspases. Initiator caspases are activated by recruitment to high molecular activation platforms and further proteolytically activate the effector caspases. The effector caspases are responsible for cleavage of key cellular proteins such as cytoskeletal proteins, which leads to the morphological changes we define as apoptosis. Different approaches for the large scale study of caspase cleavage events were developed and resulted in the identification of several hundred caspase substrates (23-25, 27). The CASBAH (15) is a web resource of known caspase substrates and currently contains 783 unique records of caspase substrates. In addition, several bioinformatics tools exist to predict caspase substrate cleavage sites within proteins, including PEPS (62), CaSPredictor (63), GraBCas (22), CASVM (21), and Pripper (64). PEPS, CaSPredictor, and GraBCas are based on scoring matrices to detect cleavage sites, whereas CASVM and Pripper use pattern recognition to detect the cleavage sites. Currently, CASVM contains 231 unique records of caspase substrates, of which 96% are already covered by CASBAH (Fig. 5). Comparing the 523 proteins in APdb that were reported as caspase substrates to both an experimental literature-based database (CASBAH) and a prediction-based database (CASVM), an overlap of 91% was found. However, using the score-based prediction tool GraBCas, we found an additional 14 proteins having cleavage motifs within their sequence. Currently, only a few hundred caspase substrates are known, but during apoptosis several thousand proteins in the cell are degraded. Using Pripper, Piippo et al. (64) estimated that ~69% of the human proteins had at least one putative caspase cleavage site, equal to ~66,500 putative caspase substrates in the human proteome.
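The database comparison behind Fig. 5 is a set operation, and the reported overlap can be cross-checked from the numbers given. The sketch below first shows the set logic on toy data and then does the arithmetic. It assumes that the 49 proteins described as unique to APdb in the Fig. 5 legend are exactly the APdb substrates found in neither CASBAH nor CASVM; the toy identifiers and the function name are hypothetical.

```python
def unique_to(first, *others):
    """Return the entries of `first` that occur in none of the other sets."""
    covered = set().union(*others)
    return set(first) - covered

# Toy identifiers for illustration only.
apdb = {"A", "B", "C", "D"}
casbah = {"A", "B", "E"}
casvm = {"B", "C"}
toy_unique = unique_to(apdb, casbah, casvm)         # {"D"}
toy_overlap = 1 - len(toy_unique) / len(apdb)       # 0.75

# Cross-check with the numbers reported in the text and the Fig. 5 legend.
apdb_substrates = 523
unique_to_apdb = 49                                   # assumption, see lead-in
covered = apdb_substrates - unique_to_apdb            # 474
overlap_pct = round(100 * covered / apdb_substrates)  # 91
```

Under that assumption, 474 of 523 substrates are covered elsewhere, which is consistent with the reported overlap of 91%.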
In conclusion, APdb represents a useful tool to consolidate data from large scale proteome analyses of apoptotic cells. Currently, APdb contains data from 52 publications that applied proteomics technology to study apoptosis and covers a large proportion of the apoptotic signaling pathway. It is available as an intuitive web-based database to compare proteomic identifications and quantification information across experiments and to easily find previously reported data on proteins of interest. We have shown that consolidated data for protein directional change during apoptosis may shed new light on proteins without previously recognized involvement in apoptosis (based on GO). Furthermore, a high level of agreement was observed between the directionality of changes reported in proteomics studies and the expected function related to apoptosis. In addition, the database can be evaluated with different bioinformatic tools to gain further insight into the outcome of proteomics experiments on apoptosis.