Comparative interactomics for virus–human protein–protein interactions: DNA viruses versus RNA viruses

Viruses are obligatory intracellular pathogens and completely depend on their hosts for survival and reproduction. The strategies adopted by viruses to exploit host cell processes and to evade host immune systems during infections may differ largely with the type of the viral genetic material. An improved understanding of these viral infection mechanisms is only possible through a better understanding of the pathogen–host interactions (PHIs) that enable viruses to enter into the host cells and manipulate the cellular mechanisms to their own advantage. Experimentally‐verified protein–protein interaction (PPI) data of pathogen–host systems only became available at large scale within the last decade. In this study, we comparatively analyzed the current PHI networks belonging to DNA and RNA viruses and their human host, to get insights into the infection strategies used by these viral groups. We investigated the functional properties of human proteins in the PHI networks, to observe and compare the attack strategies of DNA and RNA viruses. We observed that DNA viruses are able to attack both human cellular and metabolic processes simultaneously during infections. On the other hand, RNA viruses preferentially interact with human proteins functioning in specific cellular processes as well as in intracellular transport and localization within the cell. Observing virus‐targeted human proteins, we propose heterogeneous nuclear ribonucleoproteins and transporter proteins as potential antiviral therapeutic targets. The observed common and specific infection mechanisms in terms of viral strategies to attack human proteins may provide crucial information for further design of broad and specific next‐generation antiviral therapeutics.

Viral infections pose an ever-increasing danger to human beings because of emerging and reemerging diseases.
Their high mutation rates enable viruses to easily develop resistance to conventional therapeutics, which mainly inhibit essential viral proteins. This makes the problem more serious, since most current antiviral drugs are hardly effective against resistant virus strains. Therefore, efforts in next-generation antiviral drug discovery have focused on finding host-oriented drug targets, which act on cellular functions essential in the virus life cycle [1,2]. Cellular processes are manipulated and exploited by pathogenic microorganisms mainly through physical interactions between pathogen and host proteins [3]. Therefore, viral PHI networks should be investigated thoroughly in terms of the functional properties of virus-targeted host proteins, in order to identify the cellular proteins and associated functions that are indispensable for viruses to replicate and persist within the host cell during infections. Understanding the differences and similarities in the infection mechanisms used by different viral groups is crucial for designing broad and virus-specific antiviral therapeutics [1,4,5].
Viral families are grouped based on the type of nucleic acid that constitutes their genetic material, DNA or RNA [6]. DNA viruses usually contain double-stranded DNA (dsDNA) and rarely single-stranded DNA (ssDNA). These viruses replicate using DNA-dependent DNA polymerase. RNA viruses typically have ssRNA, but may also contain dsRNA. ssRNA viruses can be further grouped as positive-sense (ssRNA(+)) or negative-sense (ssRNA(-)). The genetic material of ssRNA(+) viruses is like mRNA and can be directly translated by the host cell. ssRNA(-) viruses carry RNA that is complementary to mRNA and must be converted to positive-sense RNA using RNA polymerase before translation. An exception in this group is the Retroviruses, which replicate through DNA intermediates using reverse transcriptase despite having RNA genomes.
Owing to their very small genome sizes, viruses have restricted capabilities of their own and must enter host cells to replicate, assemble and propagate. They have evolved strategies to manipulate host cell mechanisms to control their own life cycles and also to disable the antiviral responses of host immune systems [4,7,8]. Compared to DNA virus genomes, which can encode up to hundreds of viral proteins, RNA viruses have smaller genomes that usually encode only a few proteins. Because they determine the size and functionality of the resulting proteome, the size and type of the viral genetic material may have great effects on the lifestyles of viruses within host cells. DNA viruses have integrated large DNA sequences from their hosts into their genomes throughout evolution. Consequently, their genomes can encode proteins with complex functional domains of eukaryotic origin, which enable DNA viruses to finely exploit the metabolism of infected cells in order to promote their own replication within the cell. On the other hand, RNA virus proteins do not exhibit such homologies with their eukaryotic counterparts, but can still communicate with host cells through complex networks of PHIs. RNA viruses have probably evolved a different strategy, i.e. they interact with host proteins using protein-binding motifs specific to RNA viruses [9]. Consequently, it can be stated that DNA and RNA viruses have developed some distinct infection strategies, generally causing chronic and acute infections, respectively.
Despite the availability of detailed models of virus structures, replication machineries, and pathophysiologies, a more functional analysis of virus-host molecular interactions is required to capture a systems view of viral infection mechanisms in host cells. Considering the preliminary high-throughput experimental studies [4,10-13], one can state that the field of virus-human interspecies protein interactions is still developing. Nevertheless, the aforementioned systems view of viral infection mechanisms through PHIs is still lacking [2,14]. The general focus of computational analysis of PHI data is on the common and specific behaviors of bacterial and viral pathogens during infections, by comparing their protein interactions with human [3,15]. Vidalain and Tangy (2010) reviewed RNA virus-human protein interaction networks, analyzing special infection characteristics of RNA viruses through 830 PHIs. Pichlmair et al. (2012) experimentally found 1681 PHIs between 70 viral proteins and 579 human proteins and then comparatively analyzed this dataset in terms of common and specific infection strategies used by DNA and RNA viruses. Here, we comparatively analyzed the current experimental PHI data belonging to DNA and RNA viruses, covering 19 033 PHIs between 1061 viral proteins and 4943 human proteins. The PHI data were obtained from PHISTO, which is a comprehensive database of pathogen-human PPIs [16]. This study presents the first comprehensive comparison between DNA and RNA viruses in terms of their infection strategies, providing an initial systems-level understanding of viral pathogenesis through PHIs. However, the results drawn from this analysis should be interpreted with caution, since PHI data for many virus families are still scarce.

Materials and methods
PHISTO: a web-based tool for retrieval and analysis of PHI networks
Pathogen-host interaction search tool (PHISTO) was developed by our group because a comprehensive PHI database was lacking on the Web [16]. It stores up-to-date PHI data for all pathogen types for which experimentally found PPIs with human are available. PHISTO also provides integrated bioinformatic tools for visualization of PHI networks and topological/functional analysis of pathogen-targeted human proteins through its user-friendly interface (www.phisto.org). Ongoing work on PHISTO aims to also cover experimental PHI data for other mammalian host species.

Virus-human PHI data
The family-based virus-human PHIs were downloaded from PHISTO using the taxonomic filtering functionality of its browse option. Twenty-eight viral families are covered by the downloaded interspecies interactome data. Twelve families carry DNA and 16 families carry RNA as their genetic material. Retroviruses were excluded from the RNA families since they replicate through reverse transcription. Similarly, Hepadnaviruses were excluded from the DNA viruses. Consequently, PHI data belonging to 11 DNA virus families (Table 1) and 15 RNA virus families (Table 2) were obtained and used throughout the study. Two representative PHI networks, one for a DNA virus and one for an RNA virus, are shown in Fig. 1. All details of the PHI data for these 26 viral families are given in Data S1 and S2.

Human protein sets
A total of eight sets of human proteins interacting with viral pathogens were constructed from the PHI data to analyze the functional properties of virus-targeted human proteins. Firstly, four sets were constructed to observe the characteristics specific to DNA and RNA viruses with respect to their interactions with human proteins: the set targeted by DNA viruses (DNA viruses-targeted set), the set targeted by RNA viruses (RNA viruses-targeted set), the set targeted by only DNA viruses, i.e. not targeted by any RNA viruses (only DNA viruses-targeted set), and the set targeted by only RNA viruses, i.e. not targeted by any DNA viruses (only RNA viruses-targeted set). For a deeper comparison between DNA and RNA viral infections, human proteins interacting with at least four DNA virus families (4-DNA viruses-targeted set), as well as human proteins interacting with at least four RNA virus families (4-RNA viruses-targeted set), were used. In addition, to obtain the common infection strategies of viral pathogens, the set of human proteins targeted by all viruses, i.e. targeted by DNA and/or RNA viruses (viruses-targeted set), and the set targeted by both DNA and RNA viruses (DNA-RNA viruses-targeted set) were also analyzed. The number of virus-targeted human proteins covered by each set is tabulated in Table 3.
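The eight sets described above follow from simple set operations on family-level targeting data. The following Python sketch is only an illustration under assumptions: the record layout `(human_protein, virus_family, genome_type)`, the function name `build_protein_sets`, and the toy protein identifiers are hypothetical and are not taken from PHISTO or from the study's actual pipeline.

```python
from collections import defaultdict

def build_protein_sets(records, min_families=4):
    """Build the eight human protein sets from (protein, family, genome) records."""
    dna_families = defaultdict(set)  # human protein -> DNA virus families targeting it
    rna_families = defaultdict(set)  # human protein -> RNA virus families targeting it
    for protein, family, genome in records:
        if genome == "DNA":
            dna_families[protein].add(family)
        elif genome == "RNA":
            rna_families[protein].add(family)
        else:
            raise ValueError(f"unexpected genome type: {genome!r}")

    dna, rna = set(dna_families), set(rna_families)
    return {
        "DNA viruses-targeted": dna,
        "RNA viruses-targeted": rna,
        "only DNA viruses-targeted": dna - rna,
        "only RNA viruses-targeted": rna - dna,
        "4-DNA viruses-targeted": {p for p, f in dna_families.items() if len(f) >= min_families},
        "4-RNA viruses-targeted": {p for p, f in rna_families.items() if len(f) >= min_families},
        "viruses-targeted": dna | rna,
        "DNA-RNA viruses-targeted": dna & rna,
    }

# Hypothetical toy records; the protein-family pairings are invented for illustration.
toy_records = [
    ("HUMAN_A", "Herpesviridae", "DNA"),
    ("HUMAN_A", "Papillomaviridae", "DNA"),
    ("HUMAN_A", "Adenoviridae", "DNA"),
    ("HUMAN_A", "Poxviridae", "DNA"),
    ("HUMAN_B", "Herpesviridae", "DNA"),
    ("HUMAN_B", "Flaviviridae", "RNA"),
    ("HUMAN_C", "Flaviviridae", "RNA"),
    ("HUMAN_C", "Orthomyxoviridae", "RNA"),
    ("HUMAN_C", "Paramyxoviridae", "RNA"),
    ("HUMAN_C", "Coronaviridae", "RNA"),
    ("HUMAN_D", "Flaviviridae", "RNA"),
]
protein_sets = build_protein_sets(toy_records)
for name, members in protein_sets.items():
    print(f"{name}: {sorted(members)}")
```

Counting distinct families per protein, rather than interactions, is what makes the "at least four families" threshold independent of how many PHIs a single well-studied family contributes.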

Gene ontology enrichment analysis
Gene ontology (GO) [17] enrichment analysis of the human protein sets was performed using the BiNGO plug-in of Cytoscape [18]. The significance level was set to 0.05, meaning that only terms enriched with a P-value of at most 0.05 were considered after an enrichment calculation with the hypergeometric test followed by Benjamini and Hochberg false discovery rate correction. All three GO categories (biological process, molecular function, and cellular component) were scanned to identify the terms having significant association with each virus-targeted human protein set studied.
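To make the enrichment calculation concrete, the sketch below computes a one-sided hypergeometric P-value for a single GO term. It is a minimal standard-library illustration, not the BiNGO implementation; the function name and the toy counts are assumptions chosen only so the arithmetic can be checked by hand.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Return P(X >= hits) for X ~ Hypergeometric.

    population: number of proteins in the reference set
    annotated:  reference proteins carrying the GO term
    sample:     size of the virus-targeted protein set
    hits:       targeted proteins carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return tail / total

# Toy numbers (hypothetical): 10 reference proteins, 4 annotated,
# 3 targeted proteins of which 2 are annotated.
p_value = hypergeom_enrichment_p(population=10, annotated=4, sample=3, hits=2)
print(round(p_value, 4))  # (6*6 + 4*1) / 120 = 0.3333
```

In a real analysis one such P-value is computed per GO term, and the resulting list is then corrected for multiple testing before the 0.05 cut-off is applied.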

Pathway enrichment analysis
Pathway enrichment analysis of the human protein sets was performed using the Web-based tool KOBAS (ver. 2.0) [19], based on information in the KEGG pathway database [20]. In the enrichment process, the KOBAS platform uses the hypergeometric test and the Benjamini and Hochberg correction method. In this study, the P-value cut-off was set to 0.05 to obtain enriched human pathways for the virus-targeted protein sets.
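The Benjamini and Hochberg correction mentioned above can be sketched as follows. This is a generic step-up implementation written for illustration, not code taken from KOBAS or BiNGO; the function name and the toy P-values are assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P-value
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the adjusted
    # values can be forced to be non-decreasing in rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Toy raw P-values for four hypothetical pathways.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adjusted) if q <= 0.05]
print(adjusted, significant)
```

With these toy values only the first pathway stays at or below 0.05 after correction, although three of the four raw P-values were below 0.05, which shows why the corrected value rather than the raw one should be compared with the cut-off.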

Virus-targeted human proteins
The visualization of virus-human networks (Fig. 1) provides some insights into the nature of interactions between the viral and human proteins. In a PHI network, a few proteins may serve as hub nodes. These are human proteins targeted by many pathogen proteins, and pathogen proteins targeting many human proteins. This scale-free behavior is observed in the DNA virus-human PHI network to some extent (Fig. 1A). On the other hand, in RNA viruses, very few viral proteins have roles in PHI networks because of their very small genomes. Each of these RNA virus proteins usually has many interactions with human proteins (Fig. 1B). Therefore, in most cases, scale-free behavior could not be observed in RNA virus-human PHI networks. In fact, the degree distribution of virus-human protein interaction networks has not yet been fitted to any model, mostly because of their incompleteness [21], despite some preliminary attempts to model the graph properties of PHI networks [10]. The distribution of the 4943 virus-targeted human proteins according to the types of viruses attacking them can be observed from the number of proteins in the human protein sets (Table 3). A considerable number of human proteins (1354) are targeted by both DNA and RNA viruses, constituting the common viral targets. All of the virus-targeted human proteins, with the number of targeting DNA/RNA virus families, can be found in Data S3. Human proteins that are highly targeted by viruses, i.e. targeted by at least eight viral families in total, are presented in Table 4 together with the targeting virus families. The list includes 21 such human proteins.
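The notion of hub nodes and degree distribution used above can be illustrated with a short sketch that counts interaction partners in a bipartite virus-human network. The edge list, identifiers, and function names are hypothetical; this is not the procedure used to draw Fig. 1.

```python
from collections import Counter

def degree_counts(interactions):
    """Return (viral_degree, human_degree) Counters from (viral, human) protein pairs."""
    unique = set(interactions)  # ignore duplicate reports of the same PHI
    viral_degree = Counter(v for v, _ in unique)
    human_degree = Counter(h for _, h in unique)
    return viral_degree, human_degree

def degree_distribution(degrees):
    """Map each degree k to the number of nodes having exactly k partners."""
    return dict(sorted(Counter(degrees.values()).items()))

# Hypothetical toy network: one viral protein with many partners, one with few.
toy_edges = [
    ("V1", "H1"), ("V1", "H2"), ("V1", "H3"),
    ("V2", "H1"),
    ("V1", "H1"),  # duplicate entry, counted once
]
viral_degree, human_degree = degree_counts(toy_edges)
print(viral_degree.most_common(1))        # candidate viral hub
print(degree_distribution(human_degree))  # degree -> number of human proteins
```

A heavy-tailed degree distribution, with many low-degree nodes and a few hubs, is the pattern that would be consistent with scale-free behavior; as the text notes, incomplete data make any formal fit uncertain.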

GO enrichment analysis results
The enriched GO process terms can be used to point out the human processes that are attacked by DNA/RNA viruses. All enriched GO process, function and [...] (Table 7) are also important to highlight common infection mechanisms shared by the two types of viruses.

Pathway enrichment analysis results
Enriched pathway terms for five specific human protein sets are listed in Tables 8-10, presenting certain characteristics of the attack strategies of DNA and RNA viruses. As in the GO enrichment analysis results, these human protein sets are the only DNA viruses-targeted and only RNA viruses-targeted sets (Table 8), the 4-DNA viruses-targeted and 4-RNA viruses-targeted sets (Table 9), and the DNA-RNA viruses-targeted set (Table 10). Pathway enrichment analysis results are provided in Data S5 for all eight virus-targeted human protein sets under investigation.

Discussion
Most current antiviral therapeutics act by inhibiting specific viral proteins, e.g. essential viral enzymes. Unfortunately, this approach has often been ineffective because of drug resistance developed by viruses, especially in the case of RNA viruses, which can mutate very rapidly. Next-generation antiviral therapeutics are emerging that target host proteins required by the pathogens, instead of targeting pathogen proteins. If these host factors are indispensable for pathogens, but not essential for host cells, their silencing may effectively inhibit infections without rapid development of drug resistance [1,21,22]. An alternative approach is to inhibit the interactions between these host factors and pathogen proteins, instead of targeting the proteins themselves [23]. The development of these novel therapeutic approaches against infectious diseases raises the need to elucidate infection mechanisms through PHIs, in order to identify putative host-oriented anti-infective therapeutic targets. To understand the complex mechanisms of infections, computational analysis of the underlying protein interaction networks may provide crucial insights for developing non-conventional solutions [2,14,24]. This computational analysis of virus-human interactomes aims to provide initial, comparative insights into the infection mechanisms of DNA and RNA viruses, through the observation of the characteristics of human proteins interacting with viral proteins. The common and specific infection strategies of DNA and RNA viruses found here may lead to the development of broad and specific next-generation antiviral therapeutics.

Highly targeted human proteins
As the main viral infection strategy, all viruses manipulate cellular processes to proliferate within the host. Therefore, viral proteins interact extensively with human proteins functioning in the cell cycle, human transcription factors to promote transcription of viral genetic material, nuclear membrane proteins for transporting viral genetic material across the nuclear membrane, and also regulatory proteins for translation and apoptosis [3,15,25,26]. We identified human proteins that interact extensively with viral proteins, ranked by the total number of targeting virus families (Table 4). The list includes the top viral targets, which interact with multiple viral families, within the most comprehensive PHI data. Some of these human proteins were previously reported as targets for multiple viruses, i.e. P53, NPM, ROA2, GBLP, and HNRPK [3,15]. Our analyses revealed that there are six heterogeneous nuclear ribonucleoproteins (HNRPs) in the highly targeted human proteins list (HNRPK, ROA1, HNRPC, HNRH1, HNRPF, ROA2). HNRPs are RNA-binding proteins, which function in processing heterogeneous nuclear RNAs into mature mRNAs and in regulating gene expression. Specifically, they play a role in the export of mRNA from the nucleus to the cytoplasm. They also recruit regulatory proteins associated with pathways related to DNA and RNA metabolism [27,28]. Being targeted by multiple viruses, HNRPU was reported as a hotspot of viral infection and proposed as a potential antiviral human protein [4]. In the present study, HNRPU is found to be targeted by five viral families (see Data S3). Our data additionally indicate several other HNRPs targeted by viral proteins (see Data S1-S3). For all virus-targeted HNRPs, the number of targeting RNA virus families is found to be higher than that of DNA virus families (see Data S3), suggesting that they may play crucial roles in viral RNA processing. The protein family of HNRPs may serve as host-oriented antiviral drug targets.
Moreover, our analyses also showed that proteins functioning in transport- and localization-related processes within the cell are highly targeted by both DNA and RNA viruses, i.e. IMA1, ADT2, TCPG, and TCPE. IMA1 (Karyopherin alpha 2, KPNA2) functions mainly in nuclear import as an adapter protein for the nuclear receptor KPNB1 (Karyopherin beta 1). Interacting with IMA1 enables viruses to enter the nucleus and consequently to use the host's transcriptional machinery. Besides, viruses may interact with IMA1 in order to inhibit the host antiviral response, since nuclear import factors regulate the transport of innate immune regulatory proteins to the nucleus to activate the antiviral response [3,29-31]. The transmembrane transporter activity of ADT2 is responsible for the exchange of cytoplasmic ADP with mitochondrial ATP across the mitochondrial membrane, serving crucial roles in metabolic processes [32]. Attacking human metabolic processes was reported as a common infection strategy of bacteria and viruses [15]. The proteins TCPG and TCPE are responsible for RNA localization activity, and our results reveal that they are targeted by a larger number of RNA virus families (Table 4). Highly targeted transporter proteins should be investigated further for their potential as next-generation antiviral targets, because of their crucial roles in the viral life cycle within the host organism.
EF1A1 and EF1A3 function as translation elongation factors in protein biosynthesis. EF1A proteins promote the GTP-dependent binding of aminoacyl-tRNA to the A-site of ribosomes during protein biosynthesis and are responsible for achieving accuracy of translation [33]. Translation elongation factors were reported as targets for viruses in early studies [34-36]. Since they are essential components of the cellular translational machinery, viruses interact with them for biosynthesis of viral proteins within the host cell. We found translational elongation to be the top biological process commonly targeted by both DNA and RNA viruses (Table 7).
Interacting with human transcription factors was reported as one of the main viral infection strategies [3,15]. Among the highly targeted human proteins, YBOX1 and P53 have transcription factor activity. Both of these proteins are multifunctional. YBOX1 functions as a transcription factor in the transcription of numerous genes. It also contributes to the regulation of translation. P53, on the other hand, is the well-known tumor suppressor acting as an activator of apoptotic cell death. Apoptosis is an innate immune response to viral infection. It is a crucial process during the progress of viral infection and must be strategically controlled by viruses for a successful infection. In the early stage of the viral life cycle in the host cell, apoptosis is inhibited by corresponding virus-human protein interactions. After completion of transcription and translation of viral genetic material, viruses try to induce apoptosis to assist virus dissemination [37-39].
Among the highly targeted human proteins in Table 4, EF1A1, ADT2, TBA1C, GRP78, TBB5, P53, TCPG, HS90B, and TBA1A were found as drug targets listed in DrugBank [40]. However, only ADT2, GRP78, TBB5, P53, and TBA1A are targets of approved commercial drugs. Nevertheless, no antiviral therapeutic usage is available for these drug targets yet. The above-mentioned human proteins, namely ribonucleoproteins, proteins functioning in intracellular transport and localization, translation elongation factors, and transcription factors, require further investigation of their potential to serve as antiviral drug targets.

Targeted human mechanisms
Gene ontology and pathway enrichment analyses of pathogen-targeted host proteins are widely used in bioinformatic analysis of PHI networks to understand the attack strategies of pathogens [3,4,15,41,42], as well as in verification of computationally predicted PHIs [43]. Additionally, GO and pathway terms are widely used as features in computational PHI prediction studies [44,45]. Our observation of the enriched GO process terms for human proteins targeted by only DNA viruses (Table 5) may lead to the conclusion that DNA viruses have specifically evolved to be able to attack human cellular and metabolic processes simultaneously during infections. Using this PHI mechanism, DNA viruses can finely exploit the cellular and metabolic mechanisms of infected cells to their own advantage, generally resulting in chronic infections in humans. On the other hand, GO process terms enriched in human proteins targeted by only RNA viruses are mostly related to RNA processing, intracellular transport and localization within the cell (Table 5). It was reported that RNA viruses extensively target human proteins that are involved in RNA metabolism and also in protein and RNA transport, to promote viral RNA processing for a successful infection [4].
Further investigation of the enriched processes of human proteins attacked by multiple DNA viruses (Table 6) pointed out their strong preference for targeting cellular processes. It was reported that DNA viruses tend to target crosstalking human proteins linking the cell cycle with either transcription or chromosome biology, with the possible aim of promoting viral replication instead of cellular growth [4]. For the RNA viruses, we found that the human proteins attacked by multiple RNA virus families are enriched in specific processes within the cellular mechanisms (Table 6). All viruses need the host's transcriptional machinery for transcription of viral genetic material.
In the case of human proteins targeted by both DNA and RNA viruses, the P-values of the enriched GO process terms are very low, indicating statistically strong results (Table 7). The most highly targeted human process is translational elongation. Translational control of viral gene expression in eukaryotic hosts was reported repeatedly [46-48]. Here, we presented translational elongation as the top GO process term enriched in human proteins targeted by both DNA and RNA viruses within the current experimental PHI data. The remaining list includes cellular and metabolic processes, which can be considered as targets of both virus types. Based on these observations, we can state that the common viral infection strategy is to target human proteins functioning within the processes of gene expression and protein synthesis, simply because viruses lack such machineries of their own. All viruses depend on the cellular mechanisms for these processes, and they recruit host ribosomes for translation of viral proteins. A comparative investigation of the enriched pathway terms for human protein sets targeted by only DNA viruses and by only RNA viruses (Table 8) reveals additional support for the different infection strategies of these viral groups. There is no common term in these two lists of enriched human pathways. The cell cycle pathway targeted by only DNA viruses and the RNA-related pathways targeted by only RNA viruses parallel the results of the GO enrichment analyses. The only enriched pathway terms in the 4-DNA viruses-targeted human protein set are Epstein-Barr virus (EBV) infection and viral carcinogenesis (Table 9). EBV is a species of the DNA virus family Herpesviridae, which constitutes nearly half of the DNA viruses-human PHI data (Table 1). In addition, it is estimated that 15% of all human tumors are caused by viruses, mainly DNA viruses, i.e. Herpesviruses and Papillomaviruses [49].
The pathway enrichment analysis of the 4-RNA viruses-targeted set brings forward terms related to protein processing and the immune system (Table 9). Finally, for the common targets of the two virus types, we obtained the ribosome term enriched with a very small P-value (Table 10). Both virus types use host ribosomes for viral protein synthesis.

Conclusions
In this study, an initial system-level understanding of viral infection mechanisms through PHI networks was pursued by comparing DNA and RNA viruses, aiming to provide a framework for further investigations of infection mechanisms in the light of more precise information on pathogen-host systems in the near future. Ongoing studies and increasing amounts of experimentally verified PHI data will further improve our understanding of the interplay between pathogens and human and hopefully identify novel and effective therapeutics for infectious diseases.

SD and KÖÜ conceived the study. SD performed the study. SD and KÖÜ prepared the manuscript.