The Promise of Proteomics in the Study of Oncogenic Viruses

Oncogenic viruses are responsible for about 15% of human cancers. This article explores the promise and challenges of viral proteomics in the study of the oncogenic human DNA viruses HPV, MCPyV, EBV and KSHV. These viruses have coevolved with their hosts and cause persistent infections. Each virus encodes oncoproteins that manipulate key cellular pathways to promote viral replication and evade the host immune response. Viral proteomics can identify cellular pathways perturbed by viral infection, identify cellular proteins that are crucial for viral persistence and oncogenesis, and identify important diagnostic and therapeutic targets.

Systems Virology and Viral Proteomics-Proteomics is the identification and characterization of a collection of proteins found in a specific condition or circumstance. As outlined in Fig. 1, viral proteomics can define widely different collections of proteins related to viral infection. These proteomes can range from the proteins in a virion particle, to the cellular proteins found in complex with a particular viral protein, to global changes in cellular proteins in a diseased tissue following viral infection. This article will describe some of the major advances and challenges in viral proteomics, with particular emphasis on oncogenic DNA viruses.
Oncogenic Viruses-About 15% of human cancers are caused by viral infection (3). Seven human tumor viruses have been described to date: Epstein-Barr virus (EBV or HHV4); Kaposi's sarcoma-associated herpesvirus (KSHV or HHV8); Hepatitis B and C viruses (HBV and HCV); Human T-lymphotropic virus 1 (HTLV-1); Merkel cell polyomavirus (MCPyV); and a group of alpha human papillomaviruses (HPVs). These are listed in Table I. The DNA viruses MCPyV, EBV, KSHV, and a subset of oncogenic alpha HPVs are direct carcinogens that encode oncogenes which are required for maintenance of the tumor phenotype (4); they will be the focus of this article. Fig. 2 shows the vast difference in genome size and coding capacity among these four oncogenic viruses.
One common feature of oncogenic viruses is that they cause persistent infection of the host and must evade immune detection for very long periods of time (4). The oncogenic herpesviruses have large genomes that encode many proteins to facilitate the viral life cycle, but also to escape immune detection. In contrast, the oncogenic polyomaviruses and papillomaviruses have very limited coding capacity (see Fig. 2) and rely almost completely on interactions with host proteins to fulfill the same functions. Remarkably, these divergent oncogenic viruses target many of the same cellular proteins and pathways to facilitate viral replication, and this has provided great insight into the study of oncogenesis.
Advances in Proteomic Studies of Oncogenic Viruses-SV40 and adenovirus were the first intensely studied oncogenic viruses and they were instrumental in the discovery of the major cellular tumor suppressors, p53 and pRb. Although SV40 and adenoviruses do not cause tumors in their natural hosts, they form tumors in rodents (5,6) and this is dependent on SV40 large Tag and adenovirus E1A and E1B. Pioneering studies by several laboratories in the 1970s and 1980s showed that these viral oncogenes bound to the cellular proteins p53 and pRb, and this binding correlated with their ability to transform cells (7)(8)(9). The associated cellular proteins were first noted in co-immunoprecipitates with viral proteins. These early studies relied heavily on highly specific antibodies against the viral tumor antigens and the host interacting proteins (7,8,10). Partial peptide mapping and protein sequencing techniques were also used to compare and identify proteins (8,11), although it was some time before the functions of the host proteins were revealed. There was great excitement when follow-up studies showed that the viral oncogenes HPV E6 and E7, SV40 large Tag, and adenovirus E1A and E1B all bound and inactivated pRb and p53 (12)(13)(14)(15)(16).
Thirty years later, under the same principle of the commonality of tumor virus targets, Rozenblatt-Rosen and colleagues undertook an impressive systems virology analysis of cellular pathways perturbed by the human oncogenic viruses EBV, HPV, adenoviruses, and polyomaviruses (17). The authors proposed the Variome to Virome hypothesis, which stated that a comparison of cellular pathways perturbed by these viruses should facilitate the identification of driver versus passenger mutations in human tumors. This study identified additional cellular partners of the viral oncoproteins by both high-throughput yeast two-hybrid analyses and tandem affinity purification (TAP) combined with LC-MS/MS (liquid chromatography tandem mass spectrometry). This was further integrated with a microarray-based transcriptome analysis that compared how each viral oncoprotein perturbed cellular networks. The authors concluded that viral oncogenes and cancer-associated mutations in the host genome converged on common cellular pathways. These studies highlight the remarkable progress that has been made in systems virology in the last few decades.
Early comparative proteomic studies of oncogenic viruses used 2D gel electrophoresis to identify differential protein expression in virally infected cells or tumors, and later, differentially expressed proteins could often be identified by mass spectrometry techniques. The yeast two-hybrid technique, and related mammalian two-hybrid techniques, have been extensively used to identify interacting cellular factors of oncogenic viruses (17)(18)(19)(20)(21). However, MS-based techniques are advancing rapidly and are used widely, and so studies based on this technology are the focus of this article.
Approaches used in Oncogenic Viral Proteomics-There are many different ways to study the viral associated proteome. Sophisticated virion purification methods can provide a highly enriched sample to study protein content and posttranslational modifications. In infected cells, viral and cellular proteins can be defined with a global and unbiased shotgun approach in which all viral and cellular proteins can be identified at a specific time of infection, an intracellular location, or associated with a specific activity (e.g. replication). In a complementary approach, viral interactomes can be defined by determining all interactions among viral and cellular proteins using highly specific protein complex purification techniques. In practice, many of these approaches can be combined to yield important information about viral infection. These approaches are outlined in Fig. 1.
Proteomics of Virion Particles-Proteomics can define the proteins contained in virion particles. The components of the large, enveloped EBV and KSHV virions have been analyzed by LC-MS/MS, revealing 24-34 viral proteins in the viral capsid, tegument, and envelopes, as well as several host proteins (22,23). In the EBV study, purified virions were further fractionated into envelope, tegument, and capsid containing components (verified by electron microscopy) to further define the structure of the virion (22). Treatment of the virions with deglycosylases helped identify the highly glycosylated proteins associated with the viral envelope (22). Mass spectrometry techniques can also reveal post-translational modifications of virion proteins; Lind et al. identified phosphoproteins in the adenovirus type 2 virion using LC-MS/MS techniques but, despite the highly purified nature of virion particles, they had to employ several additional strategies such as TiO2 enrichment and alternative digestion strategies to identify virion associated phosphoproteins (24). These studies provide important insight into the viral life cycle strategy, as virion associated proteins are often required to evade the host intrinsic immune system as well as to initiate the immediate-early viral transcriptional process.
In comparison, the small nonenveloped polyoma and papilloma virions are very simple, with just one major and one minor capsid protein (25,26). One unique, and relatively unstudied, feature of these small DNA viruses is that their genomes are packaged in host nucleosomes (27,28). The activity of chromatin is highly modified by extensive post-translational modifications (PTMs) of histones, which could greatly determine the efficacy of the early stages of viral infection.
Early studies on polyomaviruses showed that virion associated histones are highly acetylated (27); this has been confirmed, and many additional modifications identified, by Fang and colleagues, who mapped extensive post-translational modifications in the histones of the polyomavirus BKPyV virion particles and minichromosome using triton-acetic acid-urea (TAU)-polyacrylamide gel electrophoresis separation followed by nanoflow LC-MS/MS analysis (29). Notably, the authors found that N-terminal acetylation of histone H2A occurred only on the viral genomes packaged in virions, and not on those in infected cells (29). As yet, these studies have not provided functional insight into these modifications. Nevertheless, it would be surprising if the polyoma and papillomavirus life cycle strategies did not take advantage of epigenetic modulation of virion DNA to facilitate viral infection.
Infected Cells: Viral Chromatin-Global profiling of chromatin modifications by mass spectrometry is challenging, in part because most modifications are located in the highly basic N-terminal tails of the histones. The Garcia laboratory has developed a very detailed workflow that includes propionic anhydride derivatization of lysine residue side chains before trypsin digestion to circumvent this problem. An additional propionylation step targets the N terminus of the digested peptides to help identify these very short peptides by nano LC-MS/MS (30). The workflow also incorporates a TiO2 binding step to enrich for phosphopeptides. Garcia and colleagues used this method for dynamic global profiling of adenovirus (30) and cytomegalovirus (31) histone PTMs during infection. Mass spectrometry based techniques can provide detailed and unbiased information about global histone post-translational modifications in a virus. However, the availability of highly specific antibodies directed against individual histone modifications enables relatively easy purification of modified chromatin (ChIP) and subsequent identification of the specific sequence of bound DNA by PCR or DNA sequencing technologies (32). At this point, the easy accessibility of ChIP techniques, and the sequence-specific information obtained, make them the method of choice.
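The rationale for derivatizing lysines before trypsin digestion can be illustrated with a small in-silico digestion. The sketch below is only an illustration and is not the Garcia laboratory's software: it assumes that trypsin cleaves after every lysine and arginine, that propionylated lysines are no longer cleaved (so cleavage occurs after arginine only), and it ignores missed cleavages and the proline rule. The function name `digest` and the input, the first 28 residues of the histone H3 N-terminal tail, are chosen here for illustration.

```python
# Illustrative in-silico digestion; a simplified sketch, not a published tool.
# Assumption: cleavage occurs C-terminal to every residue in `cleave_after`
# (missed cleavages and the proline rule are ignored).

def digest(sequence, cleave_after):
    """Split a protein sequence after each residue found in cleave_after."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in cleave_after:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

# First 28 residues of the histone H3 N-terminal tail (illustrative input).
H3_TAIL = "ARTKQTARKSTGGKAPRKQLATKAARKS"

tryptic = digest(H3_TAIL, "KR")       # free lysines: cut after K and R
propionylated = digest(H3_TAIL, "R")  # lysines blocked: cut after R only

print("trypsin only:        ", tryptic)
print("after propionylation:", propionylated)
```

Under these assumptions the unmodified tail is cut into fragments of at most five residues, whereas blocking the lysines yields longer peptides such as KSTGGKAPR and KQLATKAAR that still carry the modifiable lysines.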
Infected Cells: Temporal Studies-Temporal studies can reveal global changes in the viral and cellular proteome at different stages of infection. These studies can characterize the cellular and viral proteome at different stages of infection, or can define the viral interactome for one or more viral proteins. To date, most temporal studies have examined the non-oncogenic herpesviruses HSV1 (herpes simplex virus) and HCMV (human cytomegalovirus). For the oncogenic viruses, it is more difficult to produce large amounts of viral particles, to infect cells synchronously, and to complete the viral life cycle in a short time frame. Furthermore, most early gene products are expressed at very low levels. Oncogenic viruses establish persistent infections and the late stages of infection must be induced by manipulation of the host cell (e.g. differentiation). Quasivirus particles (recircularized viral genomes packaged in a cell line overexpressing the capsid proteins) can be used to generate papillomavirus and polyomavirus particles to study the early stages of infection, and in theory epitope tagged versions of viral proteins could be packaged in these recombinant particles to facilitate their detection and localization (33,34). More efficient methods are also being established to induce the late stages of infection.
Infected Cells: Spatial Studies-Throughout the course of infection, there can be dramatic changes in cellular organelles as well as the formation of viral replication and assembly factories. Subcellular fractionation and analysis of the protein components in these intracellular structures can provide great insight into the infectious process and reveal ways in which viruses manipulate cellular organization. Baquero-Pérez and Whitehouse took advantage of the fact that KSHV replication and transcription centers are associated with the nuclear envelope; this allowed them to purify these compartments and then identify cellular factors enriched in them using LC-MS/MS (35). These studies revealed that the molecular chaperone Hsp70 was crucial for the formation of these compartments (35). Bartee et al. used SILAC (stable isotope labeling with amino acids in cell culture) and 2D-LC-MS/MS to compare the proteomes of plasma, Golgi, and endoplasmic reticulum membranes of cells in the presence or absence of the KSHV K5 protein, which was known to downregulate MHC class I molecules on the surface of cells (36). Using this approach, they identified three additional immunomodulatory proteins that were underrepresented in the presence of K5 (36). Similar approaches with other oncogenic viruses should reveal a wealth of additional information about how these viruses manipulate cellular function and organization.
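The logic of a SILAC comparison such as the one described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis used in the cited study: the protein names and intensities are hypothetical placeholders, the "heavy" channel is arbitrarily assigned to cells expressing the viral protein, and the two-fold cutoff is an assumed threshold. A real analysis would also normalize ratios and use replicate measurements.

```python
import math

# Hypothetical (heavy, light) intensity pairs in arbitrary units.
# Assumption: "heavy" = cells expressing the viral protein, "light" = control.
# Names and values are placeholders, not data from any published study.
intensities = {
    "protein_A": (1000.0, 1050.0),
    "protein_B": (120.0, 960.0),
    "protein_C": (400.0, 390.0),
    "protein_D": (50.0, 800.0),
}

def log2_ratios(pairs):
    """Return log2(heavy / light) for each protein."""
    return {name: math.log2(heavy / light) for name, (heavy, light) in pairs.items()}

def underrepresented(ratios, threshold=-1.0):
    """Proteins whose log2 ratio is at or below threshold (default: 2-fold down)."""
    return sorted(name for name, ratio in ratios.items() if ratio <= threshold)

ratios = log2_ratios(intensities)
print(underrepresented(ratios))
```

With these placeholder values, protein_B and protein_D fall below the cutoff and would be flagged as candidates depleted from the membrane fraction.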
Infected Cells: Exosomes-Exosomes are small membrane bound vesicles that are secreted from cells into bodily fluids, and are thought to regulate the cellular microenvironment, particularly when secreted from tumor cells. Exosomes by their very nature provide an easily purified source of proteins. LC-MS/MS analyses of exosomes secreted from EBV and KSHV infected cells have shown that they contain complex mixtures of proteins that are dramatically modulated by viral infection (37). Another study examined the protein content of exosomes secreted from MCPyV positive and negative Merkel cell carcinoma cell lines; using LC-MS/MS, proteins involved in cellular motility and oncogenesis were identified (38). There is strong interest in defining the contents of exosomes in the quest for tumor biomarkers because they can be isolated noninvasively from body fluids such as saliva.
Infected Cells: Functional Studies-Activity based protein profiling (ABPP) uses highly specific probes that consist of a reactive warhead (that creates an irreversible bond between probe and enzyme), a tag that specifically binds to the catalytic sites of the targeted enzymes, and a reporter that allows their detection or purification (39,40). Comparative ABPP can compare the activity of a class of enzymes in the presence or absence of viral infection. For example, this approach was used to show that both EBV and HPV induced oncogenesis correlates with up-regulation of a series of deubiquitinating enzymes (41,42). To date, there are about 12 different classes of enzymes that can be targeted by ABPP chemistries (43). These probes hold great promise for viral proteomics, as well as for identifying and optimizing anti-viral therapeutics that can bind to the active site of viral or host enzymes (39).
Advances in click chemistry have also allowed the development of small, cell permeable activity based probes that are highly active in the intracellular environment (44). Cortez and colleagues developed a technique to identify proteins at replication forks; it is named Isolation of Proteins on Nascent DNA, or iPOND, and is coupled with mass spectrometry (45). This technique uses the nucleoside analog 5-ethynyl-2'-deoxyuridine (EdU), which is incorporated into nascently replicated DNA, together with click chemistry, to define the replisome. Dembowski and DeLuca used this technique to identify both host and viral factors located in replication centers of HSV1 (46), and it is likely that this technique will prove useful for the study of oncogenic viral replication. Many viruses manipulate the DNA damage and repair response to replicate their own DNA (47,48) and the iPOND technique could help determine which of these factors are utilized at the replication fork (46).
Infected Cells: Viral Interactomes-Shotgun proteomics can classify the global proteome at specific stages or locations of infection in an unbiased fashion. However, almost all biological processes and pathways function through protein-protein interactions (PPIs) and so many targeted approaches have been developed that can be used to identify viral-viral and viral-host interactomes (49). These approaches have been especially fruitful in the study of oncogenic viruses, which have a long-term association with the host, and many key cellular regulatory proteins have been identified because they are targeted by oncogenic viral proteins.
High throughput protein complementation assays such as the yeast two hybrid assay (Y2H) can identify binary interactions between proteins (50) and a number of oncogenic virus interactomes have been developed using this or derivative techniques. For example, extensive and comparative interactomes have been identified for the E2, E6, and E7 proteins from different HPV types using both yeast and mammalian complementation assays (51,52). The Y2H approach was also used to define a global viral-host interaction network of all KSHV proteins (53), and for EBV (21). These studies showed that viral networks tended to appear as single, highly coupled modules (53).
In the last few years, the combination of affinity purification and mass spectrometry (AP-MS) has dominated the viral proteomics field. Viral protein complexes can be affinity purified and interacting proteins rapidly identified by LC-MS/MS techniques. These viral proteins can be expressed alone (fused to high affinity tags) or expressed from the viral genome in the context of an infection. Fluorescent tags enable the location of the viral protein to be monitored in living cells and correlated with protein interactions at various times post infection (54). For example, White et al. identified and compared the interaction partners of a series of E6 proteins from different HPVs using AP-tandem MS (55). By comparing E6 proteins from sixteen different HPVs, the authors could distinguish protein partners of the alpha-HPV E6 proteins (the ubiquitin ligase E6AP) from those of the cutaneous beta-HPV E6 proteins (MAML and associated Notch proteins) (55). Proteomic profiling of EBV EBNA1 by AP-LC-MS/MS defined protein interactions in EBV-associated cancers in both latent and lytic infection (56). Similar studies using KSHV LANA showed that the RFC complex interacted with LANA and is important for viral replication (57). Another study defined cellular proteins that interacted with LANA, and determined which of these interactions were mediated through the LANA SUMO-interacting motif (SIM) (58).
BioID is a relatively new method that enables the identification of both proximal and interacting proteins in living cells (59,60). The target protein is fused to a biotin ligase that, in the presence of excess biotin, will biotinylate adjacent proteins. The biotinylated proteins are purified by affinity methods and identified by standard LC-MS/MS techniques. This method is particularly useful in identifying transient interacting partners, or protein complexes that are difficult to extract intact from cells. Although BioID has not yet been used for an oncogenic viral protein, Ortiz et al. used it to identify both viral and cellular interacting partners of the HCMV tegument protein pUL103 (61). Of note, they identified the ESCRT-associated protein ALIX as a binding partner and discovered a previously unidentified ALIX binding domain in pUL103 (61).
Viral interactomes do not necessarily require that a viral protein be the bait. Si et al. used the KSHV terminal repeat DNA as an affinity ligand and identified 123 bound proteins, mostly present in KSHV infected cells, using LCQ-MS (62).
Infected Tissues: Differentiation Dependent Viral Life Styles and Oncogenesis-Shotgun proteomics of virally infected tissues and associated cancers allows the unbiased identification of proteins and has the potential to discover novel therapeutic targets, pathogenic virus signatures, or biomarkers (63). For example, Malik et al. provide a comprehensive review of candidate biomarkers, discovered by a wide range of proteomic techniques, in tissues and saliva from individuals with oral squamous cell carcinoma (64). However, the proteome of undissected tissue biopsies can only give a general overview of protein content, as tumors and infected tissues contain a heterogeneous mixture of cells and stroma. Identification of protein differences between different and specific cell populations requires a combination of highly sophisticated microdissection techniques and ultrasensitive proteomics analysis.
The oncogenic DNA viruses form a long-term, complex relationship with the host. For example, EBV can infect both oral epithelial cells and B-lymphocytes, and viral infection is thought to transition between both cell types for the life of the host (65). HPVs infect the basal cells of a stratified epithelium and establish a relatively quiescent, but long term, infection in these proliferating cells; high levels of viral DNA synthesis, transcription and protein expression are switched on as the infected cells differentiate and traverse to the surface of the epithelium (66). In HPV-associated oncogenesis, the infected cells acquire characteristics that enable them to resist differentiation signals and proliferate continuously throughout the full thickness of the epithelium (66). Further, genetic changes promote invasion through the basement membrane (66). In each of these scenarios, different cells within the tissue are supporting different stages of viral infection or oncogenesis. The spatial distribution of infections and cancer progression in tissue make the technique of laser capture microdissection (LCM) ideally suited to these situations as it allows biological materials (DNA, RNA, proteins) to be directly isolated from cells of interest within a tissue (67).
Most studies to date have used LCM to study the progressive stages of virus-associated cancer. Some of the earliest studies used LCM, 2D gel electrophoresis and mass spectrometry to identify protein changes among normal, premalignant, and cervical cancer tissues (68), or oral squamous cell carcinoma (69). The latter study was able to detect differences in the proteomes of HPV positive and HPV negative cancers (69). LCM followed by mass spectrometry can measure cell type specific protein expression in the tumor microenvironment, and this approach has shown that the stroma of EBV associated nasopharyngeal carcinomas overexpresses periostin compared with normal stromal tissue, and that this correlates with clinical stage and outcome (70). An important study used LCM and proteomic analysis to demonstrate that the CD21 EBV receptor was only expressed on tonsil epithelial cells (71). A quantitative proteomic analysis of Merkel cell carcinoma indicated that the MAPK, PI3K/Akt/mTOR, Wnt, and apoptosis signaling pathways were involved in the development of the tumor (72). As mass spectrometry techniques become more sensitive, studies such as these will be able to use proteomics to gain direct information about infection and disease states.
Viral Proteomics in the Clinical Forum-There is intense interest in the use of proteomics to diagnose disease (63, 73). As described above, proteomics is being used to identify biomarkers, pathogenic viral signatures, and cellular pathways that are dysregulated by viruses. However, an important goal is to identify highly specific biomarkers that enable clinicians to diagnose, stage and predict the outcome of different disease states, and to develop highly sensitive and accurate proteomic methods to detect these markers (74).
Of particular interest for oncogenic viruses are biomarkers that could be used for early detection in bodily fluids such as saliva. The Pap smear enables early histological detection of cervical abnormalities in HPV associated cancers, but there is no equivalent screen for abnormalities in the oropharynx or nasopharynx, the sites of HPV and EBV associated cancers. Noninvasive screening for biomarkers, or molecular signatures, in the salivary proteome could allow early detection of these diseases. However, some of the techniques employed have not been reproducible (75) and at this point, very few biomarkers have been successful in the clinical arena (76).
Comparative Proteomics: Evolutionary Proteomics-Comparative proteomics can be very useful in dissecting protein-protein interactions important for viral-mediated oncogenesis. These studies can either compare the interactome of very closely related viral types that have different oncogenic potential (such as the papillomaviruses), or compare the interactome of oncogenes from different viruses to identify common interactions and pathways.
The papillomaviruses are particularly well suited for phylogenetically driven comparative proteomics. There are over 200 different types of HPV and they all infect similar cell types (keratinocytes), yet have different disease outcomes. Comparative proteomic studies can identify pathogenic viral signatures and help define the key features that make a viral protein oncogenic, as well as identify structural and molecular differences that could be key therapeutic targets. In contrast, polyomaviruses are more diverse (77), their infections are often asymptomatic, and they have multiple tissue tropisms. There have been several studies using AP-MS to compare the interactomes of papillomavirus E6, E7 and E2 proteins from distinct phylogenetic groups (17, 55, 78-80), and these have identified key differences between the interactomes of oncogenic and non-oncogenic viruses.
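At its simplest, comparing interactomes across phylogenetic groups reduces to set operations on lists of interacting host proteins. The sketch below is a toy illustration under stated assumptions: the group labels, the dictionary layout, and the helper names `shared` and `group_specific` are hypothetical, and "host_X" and "host_Y" are placeholder partners. Only the E6AP versus MAML distinction reflects the alpha- and beta-HPV E6 comparison described earlier in this article.

```python
# Toy interactomes keyed by viral protein. "E6AP" and "MAML" follow the
# alpha- versus beta-HPV E6 distinction described in the text; all labels and
# the "host_X"/"host_Y" partners are hypothetical placeholders.
interactomes = {
    "alpha_E6_type1": {"E6AP", "host_X"},
    "alpha_E6_type2": {"E6AP", "host_X", "host_Y"},
    "beta_E6_type1": {"MAML", "host_X"},
    "beta_E6_type2": {"MAML", "host_X", "host_Y"},
}

def shared(data, names):
    """Host proteins found in every listed interactome."""
    return set.intersection(*(data[n] for n in names))

def group_specific(data, group, others):
    """Proteins shared by all of `group` and absent from every member of `others`."""
    return shared(data, group) - set.union(*(data[n] for n in others))

alpha = [n for n in interactomes if n.startswith("alpha")]
beta = [n for n in interactomes if n.startswith("beta")]

print("common to all:", shared(interactomes, alpha + beta))
print("alpha-specific:", group_specific(interactomes, alpha, beta))
print("beta-specific:", group_specific(interactomes, beta, alpha))
```

Published comparisons weight each interaction by quantitative evidence rather than treating it as simply present or absent, so this presence/absence view is only a starting point.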
Another approach is to analyze the interactome of different oncogenic viruses to identify common targets (17). As described above, the DNA tumor virus oncogenes were instrumental in identifying the p53 and pRb pathways as crucial targets for viral-mediated oncogenesis (17). In the systematic study described above, by comparing the interactomes (defined by TAP-MS and Y2H) and transcriptomes of HPV, polyomavirus, adenovirus and EBV, Rozenblatt-Rosen and colleagues found that Notch signaling was targeted by all DNA tumor viruses (17).
Technical Advances, Considerations and Challenges-Advances-Virology is an ideal discipline for the use and development of proteomic techniques. Uninfected cells provide a robust negative control, and spatial and temporal changes in both the viral and host proteomes during infection can be tracked, often in concert with live cell microscopy (54,81). Viruses can often be easily manipulated to express tagged viral proteins that are expressed in the context of an infection, and well characterized viral mutations can help unravel the connection between interacting protein partners and function. A comparison of interacting partners among closely related viruses can determine which interactions are required for different disease outcomes and can identify novel therapeutic targets.
Both yeast two hybrid (Y2H) and MS-based approaches have been used in the study of oncogenic viruses and each has advantages and disadvantages (82). However, advances in both the sensitivity and high-throughput capabilities of quantitative MS-based proteomics are likely to make it the method of choice. Advances in instrumentation and technologies will not only improve the precision, sensitivity and speed of MS-based analysis but "plug and play" MS systems will make protein identification an accessible and routine technique for many laboratories not expert in protein chemistry (83).
Powerful techniques such as chemical cross-linking coupled with mass spectrometry (CXMS) can elucidate the higher-order structure of protein macromolecular complexes (84,85) and Chait and colleagues envisage the eventual development of a "multiscale molecular microscope." This would entail the use of chemical crosslinkers in vivo to stabilize complexes, the rapid isolation and identification of proteins using quantitative AP-MS, and further chemical crosslinking ex vivo to map at high resolution the spatial proximities of subunits within a complex (84).
Advances in both CRISPR and chromatin immunoprecipitation technologies have led to CRISPR-based Chromatin Affinity Purification with Mass Spectrometry (CRISPR-ChAP-MS), which can define the epiproteome (86). In this technique, a catalytically inactive Cas9 protein and complementary gRNA can target, and be used to purify, a specific genomic region of interest along with associated proteins. Such approaches should prove very fruitful in the study of oncogenic viruses, which are dependent on, and take advantage of, host epigenetic mechanisms.
Enrichment, Extraction, and Fractionation-All proteomic approaches require proteins to be efficiently and reproducibly extracted from their original biological materials. However, approaches that require isolation of subcellular compartments or protein complexes must balance efficient extraction of the target with accurate retention of low affinity protein-protein interactions within complexes. A number of extraction techniques have been developed to overcome this and one example is cryogenic cell lysis, in which frozen cell pellets or tissues are ground to a powder in a ball mill before complex purification (87). Cross-linking agents such as formaldehyde can be used to stabilize interactions and allow more stringent washing conditions in affinity approaches, and this approach is used in the iPOND technique that isolates proteins bound to nascent replication forks (45,46). The technique of Tandem Affinity Purification (TAP) uses proteins with two epitope tags and protein complexes are purified sequentially using these tags. This approach can remove contaminants but low-affinity, transient interactions are likely to be lost (80,82). A related challenge is the identification and elimination of false positive and false negative interactions identified by AP methods. Techniques such as I-DIRT (Isotopic Differentiation of Interactions as Random or Targeted) can define false interactions (88) and online data repositories can help identify common contaminants (89). Trinkle-Mulcahy and colleagues have defined "the bead proteome," a comprehensive list of proteins that bind to the Sepharose, agarose and magnetic beads used for affinity purification (90). These issues are discussed in detail by White and Howley (82).
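The idea of removing likely false positives from AP-MS data can be sketched with a simple filter. The example below is a hedged illustration only: it is a plain fold-over-control filter with a contaminant blacklist, not the scoring used by I-DIRT or by any contaminant repository. The function name `candidate_interactors`, the spectral counts, the contaminant list, the pseudocount and the five-fold cutoff are all hypothetical assumptions.

```python
# Hypothetical spectral counts from a bait purification and a tag-only control,
# plus a hypothetical list of frequent contaminants of the kind catalogued in
# contaminant repositories. None of these values come from a real dataset.
bait_counts = {"partner_1": 40, "partner_2": 12, "keratin": 55, "tubulin": 30, "hsp_like": 9}
control_counts = {"keratin": 50, "tubulin": 28, "hsp_like": 8, "partner_2": 1}
known_contaminants = {"keratin", "tubulin"}

def candidate_interactors(bait, control, contaminants, min_fold=5.0, pseudocount=1.0):
    """Keep proteins enriched over control that are not listed contaminants.

    Assumption: enrichment is (bait + pseudocount) / (control + pseudocount);
    the pseudocount avoids division by zero for proteins absent from the control.
    """
    hits = {}
    for protein, count in bait.items():
        if protein in contaminants:
            continue
        fold = (count + pseudocount) / (control.get(protein, 0) + pseudocount)
        if fold >= min_fold:
            hits[protein] = round(fold, 2)
    return hits

print(candidate_interactors(bait_counts, control_counts, known_contaminants))
```

A hard blacklist also discards genuine interactions with abundant proteins, which is one reason quantitative approaches that score each interaction against controls are generally preferred.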
Computational Requirements and Challenges-Many of the proteomic studies described here generate enormous datasets and processing and interpretation of the data can be computationally intense. Proteomic data is often a small part of a larger systems virology analysis and robust computational methods are needed to integrate and compare these datasets (17). A recent ambitious study used MS-based proteomics, live cell microscopy and organelle fractionation to analyze spatial changes in the HCMV and host proteome at different times of infection (81). This required intensive machine learning to classify temporal changes in subcellular localization during infection (81).
As described below, it is crucial to share proteomic data, but there also need to be efficient ways to compare proteomic data (both from MS-based and alternative approaches) across different studies, and to integrate these data with those obtained from other systems virology data sets. Many virology research groups can undertake sophisticated proteomic experiments, often in concert with expert mass spectrometry facilities, but they often struggle to find appropriate bioinformatics support to interpret and analyze the data.
Databases: Shared Knowledge and Resources-It is important that open access resources are available to provide reliable bioinformatic information about each virus family to assist in the development of proteomic analyses, and it is also crucial that large proteomic datasets are disseminated and shared to allow others to mine the data using their own tools and expertise. Viral-host interaction databases are also invaluable. A compendium of different resources related to viroinformatics has recently been compiled by Sharma and colleagues (91).
There are also highly valuable resources that help in the interpretation of proteomic data and the design of proteomic studies. Highly popular are the CRAPome, a repository that contains lists of contaminants often found in AP-MS data (89), and COMPASS, an open-source proteomic software pipeline (94). The Pandey lab has developed http://www.silac.org/ to assist in the design of quantitative proteomics using SILAC.
It is common, and often required, that genomic sequencing and transcriptome data are deposited in open access online repositories. This is less common for proteomic data but the ProteomeXchange consortium has developed a resource to assist in the submission and access of proteomic datasets (http://www.proteomexchange.org/). As yet, ProteomeXchange does not contain many virus-related datasets, but this should improve with time. Additional resources have been reviewed by Perez-Riverol and colleagues (95).
Concluding Remarks-Despite the enormous advances that have been made in viral proteomics over the last few decades, our knowledge of global viral-host interactions is still somewhat rudimentary. Technology is advancing at a rapid pace, and imaginative new techniques are being developed frequently. However, as described here, there are still many technical challenges. Most proteomic experiments are high throughput and discovery-based, and cellular proteins and pathways that are perturbed by viral infection need careful validation. Rigor and reproducibility are especially important because of the complex and diverse nature of proteomic studies. Proteomic studies can generate vast amounts of data, and highly organized collaborative efforts and computer resources are necessary to analyze, integrate, manage, and make these data publicly available. Hanash (2011) describes some of the collaborative efforts and resources required to fulfill the full potential of proteomics in human disease (74).