Shared Molecular Strategies of the Malaria Parasite P. falciparum and the Human Virus HIV-1

We augmented existing computationally predicted and experimentally determined interactions with evolutionarily conserved interactions between proteins of the malaria parasite, P. falciparum, and the human host. In a validation step, we found that conserved interacting host-parasite protein pairs were specifically expressed in host tissues where both the parasite and host proteins are known to be active. We compared host-parasite interactions with experimentally verified interactions between human host proteins and a very different pathogen, HIV-1. Both pathogens were found to use their protein repertoire in a combinatorial manner, providing a broad connection to host cellular processes. Specifically, the two biologically distinct pathogens predominantly target central proteins to take control of a human host cell, effectively reaching into diversified host cellular functions. Interacting signaling pathways and a small set of regulatory and signaling proteins were prime targets of both pathogens, suggesting remarkably similar patterns of host-pathogen interactions despite the vast biological differences between the two pathogens. Such an identification of shared molecular strategies by the virus HIV-1 and the eukaryotic intracellular pathogen P. falciparum may allow us to illuminate new avenues of disease intervention.

The determination of protein interactions (1-4) and protein complexes (5-8) is being increasingly refined in many single and multicellular organisms as well as the human interactome (9-13). However, little is known about large-scale protein interactions between hosts and pathogens, although the identification of such maps will provide an essential foundation for the development of effective therapeutic and prevention strategies to combat diseases. Uetz et al. reported the first small map of computationally inferred physical protein interactions between the human host and the Kaposi's sarcoma-associated herpesvirus as well as the varicella-zoster virus (14). More recently, Calderwood et al. (15) empirically constructed a map of physical protein interactions between the Epstein-Barr virus and the human host. For HIV, subnetworks of virus-host protein interactions that are expressed at different stages of the infection have been identified (16) along with cofactors that enable HIV and the influenza virus to infect a host cell (17-19). Furthermore, Dyer et al. compared experimentally known interactions of different viruses with the human host (20). The first small map of experimentally determined protein interactions between the human host and the malaria parasite P. falciparum was released recently (21), and was preceded by computational predictions of host-parasite interactions (22,23).
Using large-scale sets of intraspecies protein interactions between proteins of Homo sapiens and the malaria parasite Plasmodium falciparum, we inferred conserved host-parasite interactions using orthologous protein groups. Combining these interactions with experimentally determined (21) and previously predicted interactions based on protein structural features (22,23), we pooled a large set of host-parasite interactions. Although the two pathogens differ completely in their biology and evolution, we find that both HIV-1 and P. falciparum use their protein repertoires in a combinatorial manner. Because a similar strategy evolved independently in each case, it may provide a broad and effective connection to host pathways and diversified cellular functions, which may lead to a more stable intercalation with the host cell.

EXPERIMENTAL PROCEDURES
Intraspecies Protein Interaction and Pathway Data-We used human protein-protein interaction data from large-scale high-throughput screens (13,24,25) and several interaction databases (26-29) totaling 93,178 interactions among 11,691 human genes. For the parasite P. falciparum, we collected 2743 interactions between 1299 proteins (30). To examine human protein pathways, we utilized 184 annotated signaling pathways from the PID database (31).
Host-Parasite Protein Interaction Data-We used 444 experimentally obtained interactions (21) between proteins of the human host and the malaria parasite P. falciparum. In addition, we used 691 computationally determined protein interactions from protein structures of the human host and the parasite (23).
Human-HIV-1 Protein Interaction Data-We used a compilation of 702 experimentally obtained protein interactions between the human host and HIV-1. In particular, we accounted for interactions that have been found in vital cells in the human immune system such as helper T cells, macrophages, and dendritic cells (32).
Parasite Interacting Proteins-We compiled a list of 1014 proteins in P. falciparum that have molecular features that potentially facilitate interactions with human host proteins. Specifically, we chose proteins carrying a host cell-targeting signal, allowing them to cross into the human host cell by passing several membranes (33-35). In addition, we picked proteins that have transmembrane domains, signaling peptides, and/or are located in the red blood cell membrane (36). Finally, we augmented our set with proteins that have been implicated in host-pathogen interactions, pathogenesis, and defense response as well as those proteins that are located in the host cell as suggested by their corresponding GO annotation or PlasmoDraft prediction (37-39).
Host Protein/Gene Expression Data-From a mass-spectroscopic proteome analysis of the human red blood cell (40), we selected 1,578 human proteins. In addition, we pooled a set of 19,031 human proteins that were expressed in human liver tissue (41).
Parasite Protein Expression Data-We used a list of 838 proteins that are predominantly expressed in the merozoite stage and a list of 1038 parasite proteins that are detected in the sporozoite stage (36) of the parasite's life cycle. In both studies the presence of proteins in the corresponding proteomes of the underlying cells was determined by mass-spectroscopy.
Orthologous Proteins-Using all-versus-all BLASTP searches performed by the InParanoid script (42) in protein sets of two species, sequence pairs with mutually best scores were selected as central orthologous pairs. To enhance quality, we only accepted BLAST matches with a score of >40 bits, covering at least 50% of the longer sequence. Proteins of both species that showed such an elevated degree of homology were clustered around these central pairs, forming orthologous groups. The quality of the clustering was further assessed by a standard bootstrap procedure. We only considered the central orthologous sequence pair that provided a confidence level of 100% as the real orthologous relationship, allowing us to obtain 1370 orthologous protein pairs in P. falciparum and H. sapiens.
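The filtering step described above can be illustrated with a minimal sketch. It covers only the score and coverage thresholds and the selection of mutually best pairs; the clustering and bootstrap steps of InParanoid are not reproduced. The `Hit` record and all function names are hypothetical, and coverage is assumed here to mean aligned length divided by the length of the longer sequence.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Hit(NamedTuple):
    """One BLASTP match (hypothetical record layout, not a BLAST output format)."""
    query: str     # protein ID in the query species
    subject: str   # protein ID in the other species
    bits: float    # bit score of the match
    aln_len: int   # aligned length in residues
    q_len: int     # query sequence length
    s_len: int     # subject sequence length


def passes_filter(hit: Hit, min_bits: float = 40.0, min_cov: float = 0.5) -> bool:
    """Keep matches scoring above 40 bits that cover >= 50% of the longer sequence."""
    longer = max(hit.q_len, hit.s_len)
    return hit.bits > min_bits and hit.aln_len / longer >= min_cov


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its best-scoring subject among the filtered matches."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if passes_filter(hit) and (hit.query not in best or hit.bits > best[hit.query].bits):
            best[hit.query] = hit
    return {query: hit.subject for query, hit in best.items()}


def reciprocal_best_pairs(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]) -> Set[Tuple[str, str]]:
    """Return (a, b) pairs in which each protein is the other's best match."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return {(a, b) for a, b in forward.items() if reverse.get(b) == a}
```

Ties in bit score are resolved by input order in this sketch, which is a simplification.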
Enrichment-To estimate whether nodes shared a certain feature of possible biological relevance (e.g. being a parasite target), we calculated the corresponding fraction of such nodes by $f_{a,\geq k} = |N_{a,\geq k}| / |N_{\geq k}|$, where each node in $N_{\geq k}$ had at least a certain number of neighbors $k$ in an underlying network of interactions. As a null hypothesis, we assumed a random distribution, $f^{r}_{a,\geq k}$, being the random fraction of feature $a$ among all nodes that had at least $k$ neighbors, and defined $E_{a,\geq k} = f_{a,\geq k} / f^{r}_{a,\geq k}$ as the enrichment of feature $a$. Averaging $E$ over 10,000 randomizations, we considered sets enriched with feature $a$ if $E > 1$ and vice versa.
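A minimal sketch of this enrichment measure follows. It assumes that the random fraction is obtained by reassigning the feature to the same number of randomly chosen nodes, and it divides the observed fraction by the mean random fraction, a simplification of averaging the ratio itself. The function names and the `degree` mapping (node to number of neighbors) are hypothetical.

```python
import random
from typing import Dict, Set


def fraction_with_feature(degree: Dict[str, int], featured: Set[str], k: int) -> float:
    """f_{a,>=k}: share of nodes with at least k neighbors that carry the feature."""
    nodes = [node for node, d in degree.items() if d >= k]
    if not nodes:
        return 0.0
    return sum(node in featured for node in nodes) / len(nodes)


def enrichment(degree: Dict[str, int], featured: Set[str], k: int,
               n_random: int = 10000, seed: int = 0) -> float:
    """E_{a,>=k}: observed fraction divided by the mean fraction in random sets.

    Assumes every featured node is a key of `degree`.
    """
    rng = random.Random(seed)
    observed = fraction_with_feature(degree, featured, k)
    all_nodes = sorted(degree)
    total = 0.0
    for _ in range(n_random):
        # Null model (assumption): same number of featured nodes, placed at random.
        shuffled = set(rng.sample(all_nodes, len(featured)))
        total += fraction_with_feature(degree, shuffled, k)
    mean_random = total / n_random
    return observed / mean_random if mean_random > 0 else float("inf")
```

Values of E above 1 for large k indicate that the feature is concentrated among highly connected nodes.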
Kernel Density Function-A simple way to analyze a series of values $x = x_1, \ldots, x_n$ would be a histogram. However, if the number of observations is low, the significance of a histogram is rather limited. Therefore, we defined the kernel density approximation, a smoothing operation that allowed the estimation of a putative probability density function of data points around a certain point x as
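A kernel density estimate of this kind can be sketched as follows. The kernel and the bandwidth are not specified here, so the Gaussian kernel and the user-supplied `bandwidth` parameter below are assumptions, and the function name is hypothetical.

```python
import math
from typing import Callable, Sequence


def gaussian_kde(values: Sequence[float], bandwidth: float) -> Callable[[float], float]:
    """Return a density function built from Gaussian bumps centered on each value.

    Assumption: Gaussian kernel with a fixed bandwidth chosen by the caller.
    """
    if not values or bandwidth <= 0:
        raise ValueError("need at least one value and a positive bandwidth")
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x: float) -> float:
        # Sum the kernel contributions of all observations at point x.
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in values)

    return density
```

A smaller bandwidth follows the data more closely, whereas a larger one gives a smoother curve.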

RESULTS
To predict protein-protein interactions between human host and malaria parasite proteins, we assembled a network of 93,178 interactions between human proteins (13,24-29) and a web of 2743 interactions between parasite proteins (30). Using the InParanoid database (42), we compiled 1370 orthologous pairs of human and parasite proteins. We identified a candidate host-parasite interaction in the web of human interactions if one of the two interacting human proteins had a parasite ortholog. Analogously, we determined a potential interaction in the parasite interaction network if a parasite protein had a human ortholog, revealing a total of 98,224 conserved host-parasite interactions.
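The inference of conserved interactions can be sketched as follows, assuming that an intraspecies interaction yields a candidate whenever one partner is replaced by its ortholog in the other species. The function name and the input layout are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]


def infer_conserved(human_ppi: Iterable[Pair], parasite_ppi: Iterable[Pair],
                    orthologs: Iterable[Pair]) -> Set[Pair]:
    """Return candidate (human protein, parasite protein) interactions.

    `orthologs` holds (human protein, parasite protein) pairs.
    """
    human_to_parasite: Dict[str, Set[str]] = {}
    parasite_to_human: Dict[str, Set[str]] = {}
    for human, parasite in orthologs:
        human_to_parasite.setdefault(human, set()).add(parasite)
        parasite_to_human.setdefault(parasite, set()).add(human)

    inferred: Set[Pair] = set()
    # Human interaction a-b: replace one partner by its parasite ortholog.
    for a, b in human_ppi:
        for host, partner in ((a, b), (b, a)):
            for parasite in human_to_parasite.get(partner, ()):
                inferred.add((host, parasite))
    # Parasite interaction a-b: replace one partner by its human ortholog.
    for a, b in parasite_ppi:
        for parasite, partner in ((a, b), (b, a)):
            for human in parasite_to_human.get(partner, ()):
                inferred.add((human, parasite))
    return inferred
```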
Although this set of interactions included 1137 parasite proteins, we assumed that only a subset of these parasite proteins indeed interacts with the human host. Specifically, malaria parasites export remodeling and virulence proteins to the erythrocyte surface after invasion of a red blood cell. Recent studies independently uncovered a host-cell targeting signal that participates in the process of moving proteins across membranes to the host cell membrane (33-35). Starting from proteins that carry this signal, we added proteins that have transmembrane domains, signaling peptides, and/or are located in the red blood cell membrane (36). Furthermore, we added proteins that have been implicated in host-pathogen interactions, pathogenesis, defense responses, or are located in the host cell as suggested by their corresponding GO annotation or PlasmoDraft prediction (37-39). In total, we compiled a list of 1014 parasite proteins that potentially interact with human host proteins. If host-pathogen interactions were conserved, we expected that many interacting parasite proteins would have molecular characteristics that make them accessible to host proteins. Indeed, we found a small but significant overlap (Fig. 1A, $p < 10^{-10}$, hypergeometric test) between the set of accessible parasite proteins and those identified in our network, suggesting that the set of evolutionarily conserved interactions was a result of selective pressure.
FIG. 1. A, Compiling a list of 1014 parasite proteins that can potentially interact with human host proteins, we found a small but significant overlap ($p < 10^{-20}$, hypergeometric test), suggesting that the set of evolutionarily conserved interactions is enriched with potentially interacting parasite proteins. B, In the upper panel we assumed that host-parasite interactions predominantly appeared between proteins that are expressed in human liver tissue and the sporozoite stage of the parasite, allowing us to find 7366 interactions (dotted line). Randomizing the sets of expressed genes, we confirmed the significance of this observation ($p < 10^{-3}$). In the lower panel we determined the number of interactions between proteins that are expressed in the human red blood cell and the parasitic merozoite stage. Analogously, we found that 7967 interactions (dotted line) were statistically significant when randomizing the sets of expressed genes ($p < 10^{-3}$).
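The hypergeometric test used for the overlap of two protein sets can be sketched with exact integer arithmetic. The function and parameter names are hypothetical, and the one-sided upper tail is an assumption about how the test was applied.

```python
from math import comb


def hypergeom_pvalue(population: int, successes: int, draws: int, overlap: int) -> float:
    """P(X >= overlap) when `draws` items are taken without replacement.

    population: size of the protein universe
    successes:  size of the first set (e.g. accessible parasite proteins)
    draws:      size of the second set (e.g. proteins in the inferred network)
    overlap:    number of proteins observed in both sets
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total
```

For example, `hypergeom_pvalue(10, 4, 3, 2)` evaluates the chance of drawing at least 2 of 4 marked items in 3 draws from 10, which is 40/120.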
Given that the malaria parasite infects both human liver cells and erythrocytes, we hypothesized that conserved interactions might be enriched with interactions between proteins that are expressed in the liver tissue (41) and the sporozoite stage of the parasite (36) as well as the erythrocytes and parasite blood stages. We found that the presence of 7366 interactions between simultaneously expressed host and parasite proteins was highly significant ($p < 10^{-3}$). In the lower panel of Fig. 1B, we determined the number of interactions between proteins that were simultaneously expressed in the red blood cell (40) and the merozoite stage (36). Analogously, we found 7967 significantly placed interactions (dotted line, $p < 10^{-3}$), confirming that the co-expression of proteins in conserved interactions is a highly nonrandom event.
To establish potential evolutionarily conserved interactions, we selected all 250 interactions that involved parasite proteins carrying molecular features that make them accessible to interactions with human host proteins. In addition, we demanded that the interacting parasite proteins were expressed in either the sporozoite or the merozoite stage, whereas their human interaction partners had to be present in human liver or the red blood cell, respectively. Randomizing such sets of expressed genes and accessible parasite proteins, we found that the emergence of such a set of conserved interactions is a highly nonrandom event as well ($p < 10^{-3}$). Finally, we augmented our set of conserved interactions with experimentally determined host-parasite interactions (21) and interactions that were computationally inferred from protein structure data (23). In total, we assembled 1385 interactions between 379 parasite and 823 host proteins, where the majority of interactions were provided by computationally predicted interactions, followed by experimental observations and conserved interactions (Fig. 2A, supplemental Table S1).
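A randomization test for co-expressed interactions can be sketched as follows. The exact null model is not detailed here, so drawing random expressed sets of the same size from each species' protein universe is an assumption, and all names are hypothetical.

```python
import random
from typing import Iterable, Set, Tuple


def count_coexpressed(interactions: Iterable[Tuple[str, str]],
                      host_expr: Set[str], parasite_expr: Set[str]) -> int:
    """Count interactions whose host and parasite partner are both expressed."""
    return sum(1 for h, p in interactions if h in host_expr and p in parasite_expr)


def coexpression_pvalue(interactions, host_expr, parasite_expr,
                        host_universe, parasite_universe,
                        n_random: int = 1000, seed: int = 0) -> float:
    """Empirical p-value of the observed count against random expressed sets."""
    rng = random.Random(seed)
    interactions = list(interactions)
    observed = count_coexpressed(interactions, host_expr, parasite_expr)
    hosts, parasites = sorted(host_universe), sorted(parasite_universe)
    hits = 0
    for _ in range(n_random):
        # Null model (assumption): random expressed sets of unchanged size.
        rand_hosts = set(rng.sample(hosts, len(host_expr)))
        rand_parasites = set(rng.sample(parasites, len(parasite_expr)))
        if count_coexpressed(interactions, rand_hosts, rand_parasites) >= observed:
            hits += 1
    # Add-one correction keeps the estimate above zero.
    return (hits + 1) / (n_random + 1)
```

With 1000 randomizations, the smallest attainable value is just below 10^-3.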
We were interested in exploring features of host-pathogen interactions shared between pathogens as diverse as a virus and a eukaryotic parasite. As a counterpart to the parasite P. falciparum, we chose the human virus HIV-1 and collected 702 physical interactions between 17 HIV-1 and 522 human proteins (32). Considering topological integrity, we found that the bipartite network of interactions between parasite and host proteins was composed of 104 disconnected subnetworks. Notably, the largest subnetwork contained the vast majority of parasite and host proteins. By comparison, a bipartite network of interactions between viral and human host proteins was composed of one such subnetwork. If the placement of host-pathogen interactions were a random process, we would expect that randomly assembled bipartite networks would break into many disconnected parts. By randomly connecting parasite and human host proteins, we indeed observed that randomized networks on average broke into more connected components than observed ($p < 10^{-4}$). Analogously, we rewired HIV and human proteins and found a similar result ($p < 10^{-4}$), confirming that the integrity of a host-pathogen interaction network is the result of a significantly nonrandom process.
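The integrity comparison can be sketched with a union-find count of connected components. The randomization below places the same number of interactions uniformly among the same proteins, which is an assumption about the null model; only proteins with at least one interaction are counted, and all names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Edge = Tuple[str, str]


def count_components(edges: Sequence[Edge]) -> int:
    """Count connected components of a bipartite (host, pathogen) network."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for host, pathogen in edges:
        # Tag both sides so identical IDs in the two species stay distinct.
        root_h, root_p = find(("host", host)), find(("pathogen", pathogen))
        if root_h != root_p:
            parent[root_h] = root_p
    return len({find(node) for node in list(parent)})


def random_bipartite(hosts: Sequence[str], pathogens: Sequence[str],
                     n_edges: int, rng: random.Random) -> List[Edge]:
    """Place n_edges distinct interactions uniformly at random."""
    if n_edges > len(hosts) * len(pathogens):
        raise ValueError("more edges requested than distinct pairs exist")
    edges = set()
    while len(edges) < n_edges:
        edges.add((rng.choice(hosts), rng.choice(pathogens)))
    return sorted(edges)


def integrity_pvalue(edges: Sequence[Edge], n_random: int = 1000, seed: int = 0):
    """Return (observed components, share of random networks at least as intact)."""
    rng = random.Random(seed)
    hosts = sorted({h for h, _ in edges})
    pathogens = sorted({p for _, p in edges})
    n_edges = len(set(edges))
    observed = count_components(edges)
    hits = sum(
        count_components(random_bipartite(hosts, pathogens, n_edges, rng)) <= observed
        for _ in range(n_random)
    )
    return observed, (hits + 1) / (n_random + 1)
```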
Comparing the characteristics of interactions with the human host, we calculated the number of pathogen proteins that target a host protein according to our data. We generally found that the majority of host proteins interacted with a low number of pathogen proteins and vice versa (Fig. 2B). In the inset, we counted the number of human proteins that interacted with both the parasite and the virus and found a small but significant set of 56 proteins (Fig. 2B, $p < 10^{-10}$, hypergeometric test). Assuming that important proteins interact with many partners, we expected that hubs might predominantly be targets of pathogen proteins. Using a network of human interactions, we determined the enrichment of all human proteins that were targeted by P. falciparum. We found that especially highly connected proteins are indeed affected by the parasite, a result that was confirmed by interactions between human host and HIV proteins (Fig. 3A). Such trends were reinforced when we considered the enrichment of the 56 simultaneously targeted proteins, suggesting the presence of a core that is exploited by pathogens to seize control of the host cells.
Protein pathways provide an additional level of systems information to leverage in assessing patterns that show the ways pathogens exploit the host cell. Our approach relied on the strength of 184 manually curated signaling protein pathways from the Pathway Interaction Database (31). We hypothesized that the comparably small number of targeted human proteins would allow pathogens to interact with a larger number of pathways than would appear by chance. Out of 184 pathways, human proteins targeted by the virus were part of 168 pathways, whereas the malaria parasite touched 158 pathways. Randomizing sets of attacked human proteins, we found that random sets reached significantly fewer pathways than the observed targets (Fig. 3B, $p < 10^{-5}$). In turn, the 56 human proteins that were attacked by both pathogens were part of 107 pathways. By randomizing the human sets, we determined that this observation was statistically significant ($p < 10^{-5}$). Given the nonrandom points of interaction, the intersection of attacked human proteins potentially allows the broadest possible reach into host pathways, suggesting that the parasite may diversify its contact at the host-parasite interface.
FIG. 2. A, We augmented our set of evolutionarily conserved interactions with experimentally determined and protein structure inferred host-parasite interactions. In a total of 1385 interactions between 379 parasite and 823 host proteins, most interactions were inferred from structural protein data, followed by sets of experimentally and evolutionarily determined interactions. B, Comparing the characteristics of interactions between the human host, the virus HIV-1, and the parasite P. falciparum, we calculated the number of pathogen proteins that targeted a host protein. We generally found that the majority of host proteins interacted with a low number of pathogen proteins and vice versa. In the inset, we counted the number of human proteins that were targeted by both parasite and virus proteins and found a small but significant overlap ($p < 10^{-10}$, hypergeometric test).
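The comparison of pathway reach against random protein sets can be sketched as follows. The pool from which random proteins are drawn is not specified here, so the `universe` argument is an assumption, and the function names are hypothetical.

```python
import random
from typing import Dict, Set, Tuple


def pathways_reached(targets: Set[str], pathways: Dict[str, Set[str]]) -> int:
    """Count pathways containing at least one targeted protein."""
    return sum(1 for members in pathways.values() if members & targets)


def coverage_pvalue(targets: Set[str], pathways: Dict[str, Set[str]],
                    universe: Set[str], n_random: int = 1000,
                    seed: int = 0) -> Tuple[int, float]:
    """Return (observed reach, share of random sets reaching at least as many)."""
    rng = random.Random(seed)
    observed = pathways_reached(targets, pathways)
    pool = sorted(universe)
    hits = 0
    for _ in range(n_random):
        # Null model (assumption): equally many proteins drawn from the universe.
        random_targets = set(rng.sample(pool, len(targets)))
        if pathways_reached(random_targets, pathways) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_random + 1)
```

A small p-value indicates that the targeted proteins reach more pathways than equally sized random sets.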
In Fig. 4, we illustrate the level of pathway crosstalk between 184 signaling pathways. Specifically, we counted the number of proteins that a pair of pathways shares by constructing a symmetric matrix. Applying Ward clustering, we notably observed a cluster of 50 prominent signaling pathways that strongly share proteins. Assuming that targeted human proteins allow the pathogens the broadest possible reach into host functions, we expected that this group of pathways would be specifically attacked. Applying Fisher's exact test, we found 99 pathways enriched with proteins that were targeted by HIV-1 ($p < 0.05$), whereas 49 pathways significantly harbored parasite-targeted proteins ($p < 0.05$). Qualitatively, we indeed observed that interacting pathways were predominantly enriched with targeted proteins, suggesting that strongly cross-talking pathways are prime targets of pathogens.
FIG. 3. A, Determining the enrichment of all human proteins that were targeted by P. falciparum, we found that especially highly connected proteins were predominantly affected by pathogens. Focusing on proteins that were targeted by HIV, we found a similar result. Considering the set of proteins that both pathogens target, we observed an amplification of the pathogen-specific enrichment signal. B, We counted the number of pathways that pathogens reached by targeting corresponding proteins (dashed lines). Randomizing the set of targeted proteins, we found that the choice of attacked proteins is highly nonrandom ($p < 10^{-5}$), a result we confirmed by focusing on proteins that interacted with both pathogens.
FIG. 4. Using 184 signaling pathways, we determined the number of overlapping proteins in all pairs of pathways. Applying Ward clustering, we observed a cluster of prominent signaling pathways that strongly shared proteins (boxes). Determining pathways that were enriched with proteins targeted by the virus and parasite, respectively (Fisher's exact test, $p < 0.05$), we found that groups of strongly cross-talking pathways are predominantly enriched with targeted proteins.
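The pathway crosstalk matrix and the per-pathway enrichment test can be sketched as follows. The clustering step is omitted, the one-sided form of Fisher's exact test is an assumption, and the function names and the `universe` argument are hypothetical.

```python
from math import comb
from typing import Dict, List, Set, Tuple


def shared_protein_matrix(pathways: Dict[str, Set[str]]) -> Tuple[List[str], List[List[int]]]:
    """Symmetric matrix of the number of proteins shared by each pathway pair."""
    names = sorted(pathways)
    matrix = [[len(pathways[a] & pathways[b]) for b in names] for a in names]
    return names, matrix


def fisher_enrichment(members: Set[str], targets: Set[str], universe: Set[str]) -> float:
    """One-sided Fisher's exact p-value for targeted proteins among pathway members."""
    members = members & universe
    targets = targets & universe
    n, m, t = len(universe), len(members), len(targets)
    k = len(members & targets)
    total = comb(n, m)
    # Sum the probabilities of all tables with at least k targeted members.
    tail = sum(comb(t, i) * comb(n - t, m - i) for i in range(k, min(m, t) + 1))
    return tail / total
```

The diagonal of the matrix holds the size of each pathway, and a pathway would count as enriched here if its p-value falls below 0.05.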
Focusing in detail on the 56 human host proteins that were targeted by both pathogens, we constructed a bipartite matrix that indicated whether two pathogen proteins interact with at least one common host protein (Fig. 5A). In addition, we mapped all interactions between pathogen and host proteins if a host protein shared interactions with both the virus and the parasite. Qualitatively, human proteins that were targeted by both pathogens were involved in interactions with prominent signaling proteins (Fig. 5B). Moreover, we observed that key HIV-1 proteins Vif and Tat as well as several P. falciparum proteins interacted with several components of the human proteasome, a protein assembly that is important in both the red blood cell (44) and T-cells (45). For example, the proteasome component PSMB1, which interacts with both Vif and the malaria parasite protein PFA0110w, is up-regulated in sickle cell disease (43), a condition known to influence malaria parasite infection. In human T-cells, the proteasome is part of an innate anti-HIV defense that is circumvented by the viral proteins Vif and Tat (44).
FIG. 5. A, Using 56 proteins that were targeted by both the virus HIV-1 and the parasite P. falciparum, we constructed a bipartite matrix. Specifically, we indicated a link if a pair of virus and parasite proteins shared at least one common host protein. Notably, we observed that Vif, Tat, matrix, gp160, and Nef shared the majority of targeted proteins with the parasite. B, On a qualitative basis, we mapped interactions between pathogen and host proteins if a host protein shared interactions with both the virus and the parasite. We observed a large assembly of interactions with proteasomal proteins and viral proteins Tat and Vif. Although the parasite interacted with proteasomal subunits as well, interactions were distributed among many different parasite proteins. In addition, we found numerous signaling and regulation proteins that interacted with a variety of parasite and viral proteins.
DISCUSSION
Detecting sequence homology of interacting proteins is a widely used approach to identify evolutionarily conserved interactions. Although sequence homology is a powerful technique, large inserts might obscure homology signals in gene and protein sequences of P. falciparum, which might hamper the detection of homologs in different organisms. To account for such constraints, we utilized orthologous groups of proteins that were determined in very conservative ways. In addition to imposing a strict threshold of significance for each match, we demanded that the potential match covers at least half of the longer of the two underlying sequences. After checking their nonrandom nature, we only chose bona fide orthologous pairs that allowed us to infer evolutionarily conserved interactions.
Investigating the power of evolutionarily conserved interactions, we checked if they carry any relevant signals, indicating the presence of evolutionary pressure. Indeed, we found that conserved interactions were enriched with parasite proteins that have molecular characteristics that make them accessible to interactions with human host proteins. In addition, such interactions showed strong nonrandom co-expression of parasite and host proteins in the respective life-cycle stages and invaded host tissues, strongly suggesting that interologs reflect strong evolutionary pressure.
Pooling conserved interactions with other sets of experimentally and structurally determined interactions, we established a set of interactions between selected parasite and human host proteins. Subsequently, we compared such host-parasite interactions to an experimentally determined set of interactions between the human host and the virus HIV-1.
Although the two pathogens are profoundly different in their basic biology, we found striking similarities between interaction patterns of the virus and the malaria parasite. Both HIV-1 and P. falciparum invoked intricate, yet surprisingly similar processes of interaction with a remarkably low number of proteins to exploit the human host cell. In particular, the subtle structure of the human interactome revealed sites that were not only topologically important on their own, but also were significant pathogen targets. The functional and topological role of central proteins allows the host to maintain a complex system with relatively few proteins. Because highly interacting proteins are prime targets of a single pathogen, they may be key players in the subtle molecular strategies a pathogen employs in invading and establishing an infection in a host cell. This trend was reinforced by host proteins targeted by both pathogens, suggesting a core of functions that allow a pathogen to stably intercalate with a host cell. Tapping such a topological feature with a limited proteomic repertoire in an economic, yet effective way, the malaria parasite and the virus use combinations of pathogen proteins to interact with a variety of different functions. Therefore, untangling the intricate web of host-pathogen interactions is essential to thoroughly understand their pathogenesis. Furthermore, the analysis of host-pathogen interfaces might present a molecular and functional Achilles heel that could be exploited for new interventions to limit the pathogens in a systematic way (45).
We identified malaria parasite-specific counterparts to the key players of viral infection, a surprising observation that implied an evolutionarily convergent nature of infection by vastly distinct pathogens. On a different note, we found parasite proteins and targets that could contribute to the pathogenicity of the parasite and suggest testable hypotheses about P. falciparum biology for which future experimentation could reveal possible exploitable features. In this way, webs of well-defined host-pathogen interactions can serve as evolutionary blueprints of pathogenic interference that suggest avenues guiding future efforts to eradicate pathogens that plague humankind.
Acknowledgments-This work was partly supported by NIH Grants AI055035 and AI071121 to MTF. This article contains supplemental Table S1. To whom correspondence should be addressed: National Center of Biotechnology Information, National Institutes of Health, Bethesda, MD 20892. Tel.: +1 301 402 9657; E-mail: wuchtys@ncbi.nlm.nih.gov.