Novel Burkholderia mallei Virulence Factors Linked to Specific Host-Pathogen Protein Interactions

Burkholderia mallei is an infectious intracellular pathogen whose virulence and resistance to antibiotics make it a potential bioterrorism agent. Given its genetic origin as a commensal soil organism, it is equipped with an extensive and varied set of adapted mechanisms to cope with and modulate host-cell environments. One essential virulence mechanism constitutes the specialized secretion systems that are designed to penetrate host-cell membranes and insert pathogen proteins directly into the host cell's cytosol. However, the secretion systems' proteins and, in particular, their host targets are largely uncharacterized. Here, we used a combined in silico, in vitro, and in vivo approach to identify B. mallei proteins required for pathogenicity. We used bioinformatics tools, including orthology detection and ab initio predictions of secretion system proteins, as well as published experimental Burkholderia data to initially select a small number of proteins as putative virulence factors. We then used yeast two-hybrid assays against normalized whole human and whole murine proteome libraries to detect and identify interactions between each of these bacterial proteins and host proteins. Analysis of such interactions provided both verification of known virulence factors and identification of three new putative virulence proteins. We successfully created insertion mutants for each of these three proteins using the virulent B. mallei ATCC 23344 strain. We exposed BALB/c mice to mutant strains and the wild-type strain in an aerosol challenge model using lethal B. mallei doses. In each set of experiments, mice exposed to mutant strains survived for the 21-day duration of the experiment, whereas mice exposed to the wild-type strain rapidly died. 
Given their in vivo role in pathogenicity, and based on the yeast two-hybrid interaction data, these results point to the importance of these pathogen proteins in modulating host ubiquitination pathways, phagosomal escape, and actin-cytoskeleton rearrangement processes.

Taken out of their natural rhizosphere environment, many members of the genus Burkholderia have proven adept at surviving in and adapting to many diverse environments. Of particular interest are the two closely related pathogenic members: Burkholderia mallei, the causative agent of glanders, a disease primarily affecting horses but transmittable to humans; and Burkholderia pseudomallei, which is responsible for melioidosis, a human disease endemic to Southeast Asia and Northern Australia. Human infection by these opportunistic pathogens occurs through ingestion, inhalation, or skin abrasion. Given their considerable antibiotic resistance, ability to infect via aerosol, and absence of vaccines, these pathogens form both an emerging public health threat in their natural environment and a potential bioterrorism threat (1,2). B. mallei, the focus of our study and the less characterized of the two species, is considered a deletion clone of B. pseudomallei, with >1000 genes lost in an adaptation to equine hosts. Thus, the genes retained in B. mallei share a high sequence similarity to genes common to B. pseudomallei (3), and many virulence factors are common for these two species (4). Some virulence factors present in B. pseudomallei were lost in B. mallei's adaptation to its obligatory intracellular lifestyle in equids. However, the mechanisms and capabilities required for successful host colonization are retained in B. mallei, e.g. quorum sensing (5,6), adhesion (7), capsular polysaccharide gene clusters and lipopolysaccharides (8), actin-based motility (9,10), and a number of different secretion systems (11-14). Secreted proteins, known as "effector" proteins, interact with and hijack critical human proteins and pathways, which allows the pathogen to survive and propagate in the host environment. 
These pathogen-induced interactions encompass alteration of host cell signaling (15), cytoskeletal modulations (16), ubiquitin modification (17), autophagy suppression (4), and apoptotic/pyroptotic control (18). An apparently distinctive capability of B. pseudomallei and B. mallei is their tendency to form multinucleated giant host-cell complexes as a means for the pathogen to spread using a direct cell-to-cell traversal mechanism (4,9,19). All of these capabilities are ultimately derived from natural adaptations and survival strategies evolved for living in the rhizosphere (20).
Research into the pathogenicity of B. pseudomallei and B. mallei has focused mostly on the bacterial aspect of infectivity, host immune responses, and disease progression, whereas the fundamental molecular host-pathogen interactions that mediate these activities have gathered less attention. Although the transcriptional immune response in the host has been investigated by microarray experiments (21-23), most of the proteins used by the pathogen to exert its influence are not known. A more comprehensive picture of how the pathogen interacts with host cells requires novel approaches that combine traditional bacteriology, high-throughput experiments, and computational methods. In this study, we used a combined computational and experimental approach to systematically identify B. mallei proteins and their host protein interactions. In particular, we focused on B. mallei proteins that could potentially be translocated into host cells via specific secretion systems. We then screened these bacterial proteins in vitro using yeast two-hybrid (Y2H) assays against both human and murine whole proteomes. A computational analysis of these interactions provided likely functional targets for these proteins. Accordingly, we selected three B. mallei genes previously not tied to virulence, separately constructed insertion mutants for each of the genes, and infected BALB/c mice in an aerosol challenge model with the mutants. Whereas mice infected with the wild-type B. mallei ATCC 23344 strain rapidly succumbed to the infection and died, all animals separately infected with each of the three mutant strains survived the aerosol challenge for the duration of the experiment.
These results linked three B. mallei proteins, not previously considered virulence factors, to an in vivo virulence phenotype. Two of these proteins are annotated as hypothetical proteins of unknown function, and we used the Y2H data to characterize their putative molecular functions. The apparent role of BMAA0728 relates to ubiquitination and phagosomal escape processes, whereas BMAA1865 activity appears directed toward host actin-cytoskeleton rearrangement processes. The third protein, BMAA0553, a putative phosphatase by sequence homology, is postulated to interfere with host signaling pathways and actin cytoskeleton rearrangement. Based on our current understanding of Burkholderia spp. host infections, these findings shed complementary mechanistic light on B. mallei pathogenicity as it pertains to host cell invasion and survival. They also raise the possibility that B. mallei virulence factors are involved in more than one stage of the bacterial intracellular life cycle. These hypotheses and observations lay the foundation for future biochemical and immunological work.

EXPERIMENTAL PROCEDURES
Comparative Genomics-We used comparative genomics to 1) identify genes that discriminate between pathogenic and nonpathogenic strains of B. mallei and 2) search for proteins using orthology with known virulence factors. In these computations, we used a recently published orthology detection method developed by our group, termed QuartetS (24,25), which exploits evolutionary evidence to predict evolutionarily related genes in a computationally efficient manner, enabling whole genome comparisons.
Functional annotations of genomes were taken from GenBank and Pathema (26,27). For hypothetical or unannotated proteins, we generated function hypotheses using the algorithms implemented in our integrated protein function annotation software, PIPA (28), which combines multiple programs and databases to provide a unified terminology and consensus function annotation.
Ab Initio Prediction of Type 3 Secretion System Proteins-As secretion system proteins have frequently diverged beyond recognition by orthology, specialized methods are required for their prediction. Recently, several groups have described computational methods to identify type 3 secretion system (T3SS) proteins based on protein sequence information (29-32), whereas algorithms for the prediction of type 6 secretion system (T6SS) proteins are not available. These ab initio methods for the prediction of T3SS proteins use machine learning techniques to train predictive models on sets of known secreted proteins in a range of bacteria. In this work, we used two such algorithms and reported their consensus predictions. The first method, EffectiveT3 (29), uses amino acid composition and di- and tri-peptide pattern frequency as inputs to a number of different machine learning algorithms, whereas the second method, developed by Lower and Schneider (30), uses a sliding window of protein sequence as input to neural networks and support vector machine algorithms. For the EffectiveT3 software (29), we used a probability cutoff of 0.995 for classification as a T3SS protein, whereas for the method described in Ref. (30) we used a 0.6 cutoff.
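The consensus step described above can be illustrated with a minimal Python sketch. It assumes that each predictor yields one score per locus tag and that a protein is called when its score is at or above the stated cutoff; the inclusive comparison, the function name, and the example scores are illustrative assumptions and do not reproduce either tool's actual output format.

```python
# Cutoffs stated in the text; everything else here is hypothetical.
EFFECTIVET3_CUTOFF = 0.995    # EffectiveT3 (Ref. 29)
SECOND_METHOD_CUTOFF = 0.6    # method of Ref. (30)


def consensus_t3ss(effectivet3_scores, second_scores):
    """Return sorted locus tags that pass both cutoffs (assumed inclusive)."""
    hits = []
    for locus, score1 in effectivet3_scores.items():
        score2 = second_scores.get(locus)
        if score2 is None:
            continue  # not scored by the second method, so no consensus
        if score1 >= EFFECTIVET3_CUTOFF and score2 >= SECOND_METHOD_CUTOFF:
            hits.append(locus)
    return sorted(hits)


# Made-up scores for three placeholder genes.
example_a = {"geneA": 0.999, "geneB": 0.990, "geneC": 0.997}
example_b = {"geneA": 0.80, "geneB": 0.95, "geneC": 0.40}
print(consensus_t3ss(example_a, example_b))
```

Requiring agreement between both predictors trades sensitivity for specificity, which suits a workflow in which every selected protein must later be cloned and screened.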
Y2H Screens to Identify Host-Pathogen Protein-Protein Interactions (PPIs)-B. mallei genes were cloned from B. mallei ATCC 23344 genomic DNA using the Gateway™ Entry system, pDONR™/Zeo (Invitrogen, Carlsbad, CA). In addition to clones of intact genes, 11 putative virulence factors were cloned in more than one truncated form (domain fragments). The oligonucleotides of gene-specific regions for polymerase chain reaction (PCR) amplification were designed using the "OligoCalc: Oligonucleotide Properties Calculator" Web tool (http://www.basic.northwestern.edu/biotools/oligocalc.html). To ensure the specificity of the PCRs, the gene-specific regions were designed to have a salt-adjusted melting temperature near 60°C. PCR primers incorporated the following forward and reverse recombination cloning sequences: attB1 (5′-GGGGACAAGTTTGTACAAAAAAGCAGGCTTC-3′) and attB2 (5′-GGGGACCACTTTGTACAAGAAAGCTGGGTC-3′), respectively. The oligonucleotides were purchased from Invitrogen. The selected open reading frame (ORF) sequences were amplified by PCR in a 96-well format. The PCR amplification was performed in a volume of 50 μl containing 2 ng of genomic DNA, 200 nM of each primer, 0.8 mM dNTPs, and Phusion DNA polymerase (New England Biolabs, Ipswich, MA). After denaturation at 98°C for 1 min, the cycling conditions were as follows: 25 cycles of 30 s at 98°C, 30 s at 55°C, 1 min at 72°C, and a final 10 min incubation at 72°C.
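The primer layout described above (an attB tail followed by a gene-specific region) can be sketched in Python as follows. The attB1 and attB2 sequences are taken from the text; the 20-nt arm length, the example ORF, the function names, and the salt-adjusted melting temperature formula (a commonly quoted approximation for oligonucleotides longer than 13 nt, assumed here rather than taken from the paper) are illustrative assumptions. Reading-frame and stop-codon handling are deliberately left out.

```python
import math

ATTB1 = "GGGGACAAGTTTGTACAAAAAAGCAGGCTTC"  # forward tail, from the text
ATTB2 = "GGGGACCACTTTGTACAAGAAAGCTGGGTC"   # reverse tail, from the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def salt_adjusted_tm(seq, na_molar=0.05):
    """Rough salt-adjusted Tm in degrees C (assumed formula, oligos > 13 nt)."""
    seq = seq.upper()
    length = len(seq)
    gc = seq.count("G") + seq.count("C")
    return 100.5 + 41.0 * gc / length - 820.0 / length + 16.6 * math.log10(na_molar)


def gateway_primers(orf, arm_len=20):
    """Prepend the attB tails to the gene-specific arms of an ORF."""
    orf = orf.upper()
    forward = ATTB1 + orf[:arm_len]
    reverse = ATTB2 + reverse_complement(orf[-arm_len:])
    return forward, reverse


# Hypothetical ORF used only to exercise the functions.
example_orf = "ATGGCTAGCAAAGGTGAAGAACTGTTTACCGGTGTTGTGCCGATTCTGGTTGAACTGTAA"
fwd, rev = gateway_primers(example_orf)
print(fwd)
print(rev)
print(round(salt_adjusted_tm(example_orf[:20]), 1))
```

In practice the arm length would be varied per gene until the estimated melting temperature of the gene-specific region falls near 60°C.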
The amplified ORFs were cloned into pDONR™/Zeo by recombination reactions using the BP Clonase™ II Enzyme Mix (Invitrogen). The reactions were performed with 150 ng of amplified inserts and 150 ng of pDONR™/Zeo in TE buffer (pH 8.0) and incubated overnight at 25°C. The reaction mixtures were then incubated with 1 μg proteinase K for 30 min at 37°C. Two microliters of the reaction mixtures were transformed into chemically competent DH5α. Transformants were plated on Luria-Bertani (LB) agar containing zeocin at a final concentration of 30 μg/ml and incubated overnight. Selected transformants were grown in LB medium with zeocin to prepare frozen cultures and for plasmid preparation. The cloned plasmids were validated by Sanger DNA sequencing.
Target genes in entry clones were subcloned into NH2-terminal and COOH-terminal DNA-binding domain (bait) Y2H vectors, pGBGT7g and pGBKCg (33), using Gateway LR reactions (34). LR reactions were performed using equal quantities (35 fmol) of the entry clone plasmid and the destination vector in a 10 μl volume with 2 μl (1× unit) LR Clonase II enzyme mix. Reactions were incubated for 16 h at 37°C and terminated by an incubation with 2 μg proteinase K for 10 min at 37°C. The destination clones were transformed into DH5α and plated on LB agar containing 30 μg/ml kanamycin. After bacterial transformation, miniprep plasmid DNAs of all the bait clones were transformed into Y2H yeast strain AH109 (MAT-α) (35), as previously described in Ref. (36).
Before the two-hybrid analyses, all bait yeast strains were examined for auto-activation, i.e. detectable bait-dependent reporter gene activation in the absence of any prey interaction partner. Weak to intermediate strength auto-activator baits can be used in two-hybrid array screens because the corresponding bait-prey interactions confer stronger signals than the auto-activation background. We used the HIS3 reporter gene and suppressed auto-activation by adding different concentrations of 3-amino-1,2,4-triazole (3-AT), a competitive inhibitor of HIS3. All of the B. mallei baits were inspected for auto-activation on plates containing different concentrations of 3-AT. For each bait, the lowest concentration of 3-AT that suppressed growth was used for the interaction screen (see below), because it avoided background growth while still detecting true interactions.
A haploid yeast strain expressing a single B. mallei protein as bait was used to separately perform protein interaction screens with human and mouse normalized universal two-hybrid cDNA libraries (catalog nos. 630480 and 630482, Clontech). The bait and prey (cDNA library) yeast cultures were grown, mixed at a 1:1 ratio, and plated on YEPDA agar plates (as recommended by the cDNA library provider, Clontech). YEPDA agar plates were incubated at 30°C for 6 h or overnight at room temperature. During this process, both prey and bait plasmids were combined in the diploid yeast cells by yeast mating. The cells from the mating plates were collected and transferred onto interaction selection yeast synthetic medium with predefined concentrations of 3-AT (media lacking tryptophan, leucine, and histidine plus 3-AT), and plates were incubated at 30°C for 4-6 days. Colonies that grew on the interaction selection plates when the corresponding control plates (bait mated to empty prey vector) showed no colonies were identified as two-hybrid positive yeast clones. Positive yeast colonies were selected either manually or using robotics and subjected to yeast colony PCR followed by DNA sequencing to identify the interacting preys (36).
Mapping of Host Proteins to Pathways and Networks-All computational analyses were performed in R, using Bioconductor/BioMart (37,38) for GO and KEGG analyses and the "survival" package for survival analysis (Terry Therneau, "A Package for Survival Analysis in S," R package version 2.36-14, 2012). All networks were plotted using Cytoscape (39).
Protein-Protein Interaction Data-To identify conserved PPIs between the B. mallei-human and B. mallei-murine PPI data sets, we first identified human-murine orthologs of all human and murine proteins interacting with B. mallei, using the National Center for Biotechnology Information (NCBI) HomoloGene database of homologs (http://www.ncbi.nlm.nih.gov/homologene) (40,41). Then, we denoted host-pathogen interactions in which human proteins interacted with the same B. mallei proteins as their murine orthologs as conserved interactions. For example, the interaction between B. mallei protein BMAA0553 and human protein NPC2 was conserved because BMAA0553 also interacted with murine protein Npc2, and NPC2 and Npc2 are defined as human-murine orthologs in the NCBI HomoloGene database.
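The definition of a conserved interaction can be expressed as a short Python sketch. It assumes that each data set is a collection of (bacterial protein, host protein) pairs and that the ortholog table is a simple murine-to-human mapping; only the BMAA0553-NPC2/Npc2 example comes from the text, and the other identifiers are hypothetical placeholders.

```python
def conserved_interactions(human_ppis, murine_ppis, mouse_to_human):
    """Return human PPIs whose murine ortholog binds the same bacterial protein.

    human_ppis, murine_ppis: sets of (bacterial_protein, host_protein) pairs.
    mouse_to_human: dict mapping a murine protein to its human ortholog.
    """
    mapped = {
        (bacterial, mouse_to_human[mouse])
        for bacterial, mouse in murine_ppis
        if mouse in mouse_to_human  # murine proteins without orthologs are skipped
    }
    return human_ppis & mapped


human = {("BMAA0553", "NPC2"), ("BMAA0728", "HYPOTHETICAL1")}
murine = {("BMAA0553", "Npc2"), ("BMAA1865", "Hypothetical1")}
orthologs = {"Npc2": "NPC2", "Hypothetical1": "HYPOTHETICAL1"}
print(conserved_interactions(human, murine, orthologs))
```

The second placeholder pair is not reported as conserved because the human and murine orthologs bind different bacterial proteins.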
To evaluate the influence of the pathogen on host proteins, including the relationships among host proteins targeted by the pathogen's proteins, we mapped the identified host proteins interacting with B. mallei proteins onto a human PPI network (42). The human PPI network consists of 81,226 physical PPIs among 13,382 human proteins. Of the 170 human proteins found to interact with BMAA0728, BMAA1865, or BMAA0553, 125 (~74%) were found in the PPI network. Currently available murine PPI data are largely incomplete and sparse, so we decided not to map the murine proteins onto a murine PPI network. Instead, we used the NCBI HomoloGene database to identify human orthologs of murine proteins that interacted with B. mallei proteins and mapped the identified orthologs onto the human PPI network. Of the 191 murine proteins interacting with BMAA0728, BMAA1865, or BMAA0553, 166 (87%) had human orthologs listed in HomoloGene. Of the 336 human proteins (170 human proteins + 166 murine orthologs) interacting with the three B. mallei proteins of interest, 244 (~73%) were found in the human PPI network.
To assign protein-domain annotation to B. mallei, human, and murine proteins, we used the high-confidence annotations from PFAM-A database (43).
Gene Ontology (GO) Enrichment Analysis-We used the hypergeometric distribution to assess the statistical significance of observing a given GO term enrichment in our data. For a given host (human or murine), we first counted the total number of proteins (N) annotated with any GO term. Then, for each GO term, g, we counted the total number of proteins (m) annotated with g. Next, we calculated the probability (p) of observing the same, or higher, enrichment in term g by chance, as follows:

$$p = \sum_{i=k}^{\min(n,m)} \frac{\binom{m}{i}\binom{N-m}{n-i}}{\binom{N}{n}} \qquad \text{(Eq. 1)}$$

where n represents the number of host proteins interacting with B. mallei proteins annotated with any GO term and k denotes the number of host proteins interacting with B. mallei proteins annotated with g.
For the total number of host proteins, we used the set of all human and murine proteins available in Bioconductor and BioMart (37,38) that were annotated with at least one GO term, excluding root GO terms.
In this analysis, we only considered GO annotations that were enriched at a false discovery rate control level of 0.2 or below, after accounting for multiple hypothesis testing using the Benjamini and Hochberg multiple test correction (44).
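The enrichment test and the multiple-testing correction described in this section can be sketched in Python using only the standard library. The variable names N, m, n, and k follow the definitions in the text; the function names and the example p-values are illustrative assumptions, and the original analysis was performed in R rather than with this code.

```python
from math import comb


def hypergeom_tail(N, m, n, k):
    """P(X >= k) when n proteins are drawn from N, of which m carry the term."""
    upper = min(n, m)
    total = comb(N, n)
    favourable = sum(comb(m, i) * comb(N - m, n - i) for i in range(k, upper + 1))
    return favourable / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    count = len(pvalues)
    order = sorted(range(count), key=lambda idx: pvalues[idx])
    adjusted = [0.0] * count
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity can be enforced.
    for rank in range(count, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * count / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for three GO terms.
raw = [0.01, 0.04, 0.03]
adjusted = benjamini_hochberg(raw)
enriched = [i for i, q in enumerate(adjusted) if q <= 0.2]
print(adjusted, enriched)
```

With the 0.2 false discovery rate level used in the text, a term is retained when its adjusted p-value is at or below 0.2.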
Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathways Enrichment Analysis-KEGG pathways are not independent of each other, and a large number of pathways share a statistically significant number of genes (45). Thus, analyses that search for statistically significantly enriched pathways may identify multiple pathways as significant simply because they contain a large number of shared genes (e.g. the long-term depression pathway shares ~60% of its genes with the vascular smooth muscle contraction pathway) and not necessarily because of shared functionality. As these non-pathway-specific genes are not necessarily a good representation of a specific biological process, we focused on the identified pathways that are enriched with genes that mainly belong to that pathway or a very small number of other pathways. Thus, to identify gene-specific KEGG pathways, we used the hypergeometric distribution with adjusted parameters to account for the specificity of each gene. For a given host, for each gene i in KEGG, we counted the number of pathways in which the gene appears and denoted this number as $k_i$. We then defined the probability of each gene i being specific ($p_i$) as follows:

$$p_i = \frac{1}{k_i}$$

Next, we defined the adjusted population size of all host proteins in KEGG ($A_N$) as follows:

$$A_N = \sum_{i=1}^{N_{KEGG}} p_i$$

where the summation runs over all $N_{KEGG}$ host proteins that appear in at least one KEGG pathway. We then defined the adjusted number of genes in a pathway P ($A_m$) as:

$$A_m = \sum_{i=1}^{n_P} p_i$$

where the summation runs over all $n_P$ host proteins that appear in pathway P. The adjusted total number of interacting host proteins found in KEGG pathways ($A_n$) was defined as:

$$A_n = \sum_{i=1}^{n_H} p_i$$

where the summation runs over all $n_H$ host proteins that interact with any B. mallei protein and appear in any pathway. Finally, the adjusted number of host proteins interacting with B. mallei proteins in pathway P ($A_k$) was defined as:

$$A_k = \sum_{i=1}^{n_{HP}} p_i$$

where the summation runs over all $n_{HP}$ interacting host proteins in pathway P.
Because the adjusted numbers $A_N$, $A_m$, $A_n$, and $A_k$ are real numbers, we converted each of them to its next smallest integer, N, m, n, and k, respectively, and calculated the probability of observing the same or higher specific enrichment in pathway P purely by chance using Eq. 1. For the total set of human proteins, we used all unique human proteins that belong to at least one KEGG pathway.
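The specificity-adjusted enrichment can be illustrated with the following Python sketch. It rests on two assumptions that should be checked against the original publication: that the specificity of a gene is the reciprocal of the number of pathways containing it, and that "next smallest integer" means rounding down. The pathway names, gene names, and function names are hypothetical.

```python
from fractions import Fraction
from math import comb, floor


def hypergeom_tail(N, m, n, k):
    """P(X >= k) for the hypergeometric distribution."""
    total = comb(N, n)
    return sum(comb(m, i) * comb(N - m, n - i) for i in range(k, min(n, m) + 1)) / total


def adjusted_parameters(pathways_by_gene, interactors, pathway):
    """Return the integer (N, m, n, k) after specificity weighting.

    pathways_by_gene: dict mapping gene -> set of pathway names.
    interactors: host genes that interact with any bacterial protein.
    """
    # Assumption: specificity p_i = 1 / k_i; exact fractions avoid rounding noise.
    p = {g: Fraction(1, len(paths)) for g, paths in pathways_by_gene.items() if paths}
    members = {g for g in p if pathway in pathways_by_gene[g]}
    hits = set(interactors) & set(p)  # interactors absent from all pathways drop out
    a_N = sum(p.values())
    a_m = sum(p[g] for g in members)
    a_n = sum(p[g] for g in hits)
    a_k = sum(p[g] for g in hits & members)
    # Assumption: "next smallest integer" is interpreted as rounding down.
    return tuple(floor(x) for x in (a_N, a_m, a_n, a_k))


toy_pathways = {
    "g1": {"P"}, "g2": {"P"}, "g3": {"P", "Q"}, "g4": {"P", "Q"},
    "g5": {"Q"}, "g6": {"Q"}, "g7": {"R"}, "g8": {"R"},
}
toy_interactors = {"g1", "g3", "g4", "g7", "not_in_kegg"}
N, m, n, k = adjusted_parameters(toy_pathways, toy_interactors, "P")
print((N, m, n, k), hypergeom_tail(N, m, n, k))
```

Genes shared between pathways P and Q contribute only half a count each, so a pathway enriched mostly through shared genes receives a weaker signal than one enriched through pathway-specific genes.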
Bacterial Strains, Plasmids, and Growth Conditions-Table I shows the plasmids and bacterial strains used in this study. E. coli was grown at 37°C on LB (Lennox) agar or in LB broth. B. mallei was grown at 37°C on LB agar with 4% glycerol or in LB broth with 4% glycerol. All bacterial strains were grown in broth for ~18 h with constant agitation at 250 revolutions/min. When appropriate, antibiotics were added at the following concentrations: 25 μg kanamycin per milliliter for E. coli and 15 μg kanamycin and 25 μg polymyxin B per milliliter for B. mallei. Phosphate-buffered saline (PBS) was used to make serial dilutions of saturated bacterial cultures, and the number of cfu present in the starting culture was determined by spreading 100 μl aliquots onto agar media and incubating for 24-48 h. A 20 mg/ml stock solution of the chromogenic indicator 5-bromo-4-chloro-3-indolyl-β-D-galactoside was prepared in N,N-dimethylformamide, and 40 μl were spread onto the surface of the plate medium for blue/white screening in E. coli TOP10. All manipulations with B. mallei were carried out in a class II microbiological safety cabinet located in a designated biosafety level 3 laboratory.
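The back-calculation of the starting culture density from a plate count is simple arithmetic, sketched below in Python. The 100 μl plated volume comes from the text; the colony count, the dilution factor, and the function name are hypothetical values chosen only for illustration.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ul=100):
    """Estimate cfu/ml of the undiluted culture from one plate count.

    colonies: colonies counted on the plate.
    dilution_factor: total fold dilution of the plated sample (e.g. 10**6).
    plated_volume_ul: volume spread on the plate, in microliters.
    """
    if colonies < 0 or dilution_factor <= 0 or plated_volume_ul <= 0:
        raise ValueError("counts, dilution, and volume must be positive")
    # 1000 microliters per milliliter converts the plated volume to ml.
    return colonies * dilution_factor * 1000 / plated_volume_ul


# Hypothetical plate: 42 colonies from a 10^6-fold dilution, 100 ul plated.
print(cfu_per_ml(42, 10**6))
```

For this hypothetical plate the estimate is 4.2 × 10^8 cfu/ml in the starting culture.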
PCR of Internal Gene Fragments-Table I also shows the PCR primers used to amplify internal gene fragments of BMAA0728, BMAA1865, and BMAA0553. PCR amplifications were performed in a final reaction volume of 100 μl containing 1× TaqPCR Master Mix (Qiagen), 1 μM oligodeoxyribonucleotide primers, and ~200 ng of B. mallei ATCC 23344 genomic DNA. Bacterial genomic DNA was prepared using a previously described protocol (49). PCR cycling was performed using a PTC-150 MiniCycler with a Hot Bonnet accessory (MJ Research, Inc.), with an initial heating step at 97°C for 5 min. This was followed by 30 cycles of a three-temperature cycling protocol (97°C for 30 s, 55°C for 30 s, and 72°C for 1 min) and 1 cycle at 72°C for 10 min. PCR products were sized and isolated using agarose gel electrophoresis, cloned using the pCR2.1-TOPO TA Cloning Kit (Invitrogen), and transformed into chemically competent E. coli TOP10.
DNA Manipulation and Plasmid Conjugation-The restriction enzymes EcoRI and NotI, Antarctic phosphatase, and T4 DNA ligase were purchased from Roche Molecular Biochemicals and used according to the manufacturer's instructions. DNA fragments used in cloning procedures were excised from agarose gels and purified with a GeneClean III kit (Q BIOgene). Plasmids were purified from overnight cultures using Wizard Plus SV Minipreps (Promega). Plasmids pEXKm5-0728, pEXKm5-1865, and pEXKm5-0553 (Table I) were electroporated into E. coli S17-1 (12.25 kV/cm) and conjugated with B. mallei ATCC 23344 for 8 h, as previously described (8). Polymyxin B was used to counter-select E. coli S17-1 donor strains.
Animal Experiments-Female BALB/c mice were obtained from Charles River Laboratories (National Cancer Institute) and were 6-8 weeks old at the time of use. Animals were provided with rodent feed and water ad libitum and maintained on a 12:12-hour light-dark cycle. Whole body exposures to aerosols of B. mallei were performed as previously described by Roy et al. (50). Research was conducted in compliance with the Animal Welfare Act and other federal statutes and regulations relating to animals and experiments involving animals and adhered to principles stated in the National Institutes of Health Guide for the Care and Use of Laboratory Animals (http://www.nap.edu/readingroom/books/labrats/chaps.html). The United States Army Medical Research Institute for Infectious Diseases (USAMRIID) Institutional Animal Care and Use Committee (IACUC) reviewed and approved the animal protocol entitled "Evaluation of Burkholderia mallei Secretion Inhibitors and Identification of B. mallei and B. pseudomallei Virulence Determinants in Mice and Hamsters." The animal protocol number assigned to this protocol by the USAMRIID IACUC was AP-09-045. The facility where this research was conducted is fully accredited by the Association for Assessment and Accreditation of Laboratory Animal Care International.
Survival Analysis-We used the Kaplan-Meier method (51) to generate survival curves from the number of surviving mice as a function of time. Kaplan-Meier survival curves can be interpreted as the probability of animals surviving a given length of time. This probability is defined as the cumulative probability of surviving k time periods [S(k)], as follows:

$$S(k) = \prod_{i=1}^{k} p_i$$

where $p_i$ represents the proportion of animals surviving period i. Here, $p_i$ was defined as follows:

$$p_i = \frac{r_i - d_i}{r_i}$$

where $r_i$ and $d_i$ denote the number of alive mice at the beginning of time period i and the number of deaths within period i, respectively. To compare murine survival curves, we used the following log-rank test:

$$\chi^2 = \sum_{j=1}^{2} \frac{(O_j - E_j)^2}{E_j}, \qquad E_j = \sum_{i} \frac{r_{ji}}{r_{1i} + r_{2i}}\, d_i$$

where 1 and 2 represent the set of mice infected with B. mallei mutant (set 1) and the set of mice infected with wild-type B. mallei (set 2), $r_{1i}$ and $r_{2i}$ represent the number of alive mice from sets 1 and 2 at time period i, respectively, $d_i$ here counts the deaths in both sets within period i, $E_j$ is the expected number of events in set j, and $O_1$ and $O_2$ represent the total number of observed events in each of the two sets, respectively.
Biosafety and Biosecurity-We do not anticipate that this manuscript provides knowledge, products, or technologies that could be directly misapplied by others to pose a threat to public health and safety, agricultural crops and other plants, animals, the environment, or materiel. The Institutional Biosafety Committee (IBC) that approved this work is composed of members of the USAMRIID research staff, Commander's office, and qualified representatives from external institutions and is tasked to provide local, institutional oversight of research utilizing recombinant DNA. The USAMRIID IBC was established under the United States (U.S.) National Institutes of Health Guidelines for Research Involving Recombinant DNA Molecules.

RESULTS
Figure 1 outlines the components and steps of our combined in silico, in vitro, and in vivo effort to identify B. mallei virulence factors via a systematic analysis of host-pathogen protein interactions. In the first step, we used a set of bioinformatics tools for in silico identification of putative virulence factors, including orthology detection based on pathogen proteins experimentally linked to pathogenicity in other species and ab initio prediction methods. We complemented these methods with literature review of published experimental data, resulting in a pruned list of 49 proteins. In the second step, we screened the selected proteins in vitro for potential interaction partners, using Y2H assays against normalized human and murine whole proteome libraries, and identified host-pathogen protein-protein interactions (PPIs). In the third
FIG. 1. The combined in silico, in vitro, and in vivo approach. We used bioinformatics tools to identify 49 proteins putatively involved in virulence. These proteins were then screened for interaction partners using yeast two-hybrid (Y2H) assays against whole human and whole murine proteome libraries. We mapped the identified host-pathogen protein-protein interactions (PPIs) to host pathways and host PPI networks. The host proteins interacting with Burkholderia mallei proteins were characterized with respect to their functions and possible role in pathogenicity. This analysis resulted in the identification of three promising virulence factor candidates. We verified that mutants lacking these proteins showed attenuated virulence in a mouse aerosol challenge model. Finally, we used all obtained data to delineate mechanisms of pathogenicity and generated hypotheses about the potential roles of these three proteins in B. mallei invasion of the host cell.
In Silico Identification of Putative Virulence Factors-To create an initial set of possible virulence factors, we deployed three independent bioinformatics analyses based on the identification of: 1) genes present in pathogenic and absent in nonpathogenic strains of B. mallei, 2) orthologous genes that have been associated with secretion systems in other bacteria, and 3) type 3 secretion system (T3SS)-associated secreted proteins, using amino acid pattern-based ab initio methods. This extensive set of proteins was then examined within the context of literature searches and published microarray experimental data to weigh, rank, and group the collected evidence into a final, pruned set of in silico predicted virulence factors.
In the first analysis, we collected genomic data for B. mallei and related bacterial genomes using Pathema (27) for finished and draft B. mallei genomes and GenBank (26) for all other finished genomes. We used QuartetS (25) to identify orthologous proteins in the seven strains of B. mallei reported by Losada et al. (52). Table II shows the four pathogenic and three nonpathogenic/attenuated strains used in this study. We identified orthologous pairs of genes between each pair of strains and identified 78 genes that were either present in all pathogenic strains or overrepresented in the pathogenic strains. Thirty-two of these orthologs were present in our B. mallei ATCC 23344 target strain and had no orthologs among the nonpathogenic strains. Even though this set should be enriched with proteins associated with pathogenicity, the bulk of these 32 genes were annotated with hypothetical functions or with functions that could not readily be associated with pathogenicity (supplemental Table S1) and, hence, their actual role in Burkholderia pathogenicity remains to be determined.
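The presence/absence filter used in the first analysis can be sketched in Python as a set operation over ortholog groups. This is an illustration under stated assumptions: the ortholog groups are taken as already computed, and every group identifier and strain name other than ATCC 23344 is a hypothetical placeholder.

```python
def pathogen_specific_groups(ortholog_groups, nonpathogenic, target):
    """Return ortholog groups present in the target strain and in no nonpathogenic strain.

    ortholog_groups: dict mapping group id -> set of strains with a member gene.
    nonpathogenic: set of nonpathogenic or attenuated strain names.
    target: name of the strain chosen for experiments.
    """
    return sorted(
        group_id
        for group_id, strains in ortholog_groups.items()
        if target in strains and strains.isdisjoint(nonpathogenic)
    )


toy_groups = {
    "og1": {"ATCC 23344", "pathA", "pathB"},   # pathogenic strains only
    "og2": {"ATCC 23344", "attenA"},           # also found in an attenuated strain
    "og3": {"pathA", "pathB"},                 # missing from the target strain
    "og4": {"ATCC 23344"},                     # unique to the target strain
}
toy_nonpathogenic = {"attenA", "attenB"}
print(pathogen_specific_groups(toy_groups, toy_nonpathogenic, "ATCC 23344"))
```

A filter of this kind depends heavily on the quality of the draft genomes, because a gene missing from an incomplete assembly is indistinguishable from a gene that is truly absent.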
In the second analysis, we examined gene clusters associated with bacterial secretion systems (55). In particular, previous experimental work in animal models has indicated that the T3SS (14) and the type 6 secretion system (T6SS) (11,13) are involved in virulence of B. mallei ATCC 23344. Recent publications have also elucidated the role of T3SS (56) and T6SS (11) in the virulence of B. pseudomallei. We used available literature and database information to compile a list of known and predicted proteins associated with bacterial secretion. Then, we used QuartetS to identify 24 orthologs of these proteins present in the B. mallei ATCC 23344 strain. Supplemental Table S2 gives the gene locus, name, secretion system, and evidence used to identify this set of proteins.
Finally, in the third analysis, we used two T3SS protein prediction methods to identify B. mallei proteins that share characteristics with known T3SS proteins and, hence, represent potential virulence factors (29,30). We selected proteins that were common to both prediction methods and located on the second chromosome, as the known B. mallei T3SS is located on this chromosome. This analysis identified 37 proteins (supplemental Table S3).
To select proteins for experimental evaluation, we first merged the lists obtained in the three independent analyses above and revised the merged list using published microarray experimental data. In particular, we compiled the available microarray and animal model experiments for B. mallei and the closely related bacteria B. pseudomallei, B. cenocepacia, and B. thailandensis (13,22,48,57-59). We identified proteins that were overexpressed during infections and in disease states and, hence, considered relevant for pathogenicity. Next, we used QuartetS to detect orthologs of the overexpressed proteins that are present in the merged list of B. mallei proteins. This procedure provided additional evidence for pathogenicity among our identified proteins. Finally, we individually assessed and classified each of the proteins and created a final list of 49 potential virulence factors for experimental evaluation. The final list was divided into four groups based on the assessed reliability of the evidence. Table III shows the 49 proteins, 18 of which we classified into the most reliable group (group 1). All identified proteins and their associated evidence for involvement in pathogenicity are further detailed in supplemental Table S4.
Y2H Screening Results and Data Analysis-Out of the 49 selected proteins, 43 of the corresponding genes (or gene fragments) were successfully cloned, prepared, and tested in Y2H assays against both human and murine whole proteome libraries. The raw interaction data set consisted of 637 PPIs between human and B. mallei proteins and 900 PPIs between murine and B. mallei proteins. We filtered these data sets by removing protein interactions between B. mallei and "sticky" host proteins known to be indiscriminate binders (F. Schwarz and M. Koegl, unpublished data). The filtered data consisted of 600 unique interactions between human and B. mallei proteins and 846 unique interactions between murine and B. mallei proteins (supplementary Data set S1 and supplementary Data set S2). Fig. 2 shows the host-pathogen protein interactions and Table III lists the number of Y2H interactions we detected for each tested protein. These 1446 PPIs involved 26 unique B. mallei proteins, 409 unique human proteins, and 574 unique murine proteins. Nineteen of the 26 B. mallei proteins interacted with proteins of both hosts, whereas two B. mallei proteins interacted only with human proteins and five B. mallei proteins interacted only with murine proteins. Furthermore, we found 33 conserved PPIs between these two data sets, i.e. interactions in which the human proteins interacted with the same B. mallei proteins as their murine orthologs. Finally, we found that 16 human proteins (9%) and 21 murine proteins (11%) interacted with multiple pathogen proteins, implying that these proteins may be involved in more than one stage of the infectious process and, potentially, have a multivariate pathogenic functionality.
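To make the notion of a conserved PPI concrete, the following minimal sketch counts interactions in which a human protein and its murine ortholog bind the same bait, and lists host proteins bound by more than one bait. It assumes interactions are stored as (bait, prey) pairs and that a one-to-one mouse-to-human ortholog map is available; all identifiers are hypothetical placeholders, and the ortholog mapping actually used in this study is not specified by this code.

```python
def conserved_ppis(human_ppis, murine_ppis, mouse_to_human):
    """Return (bait, human_prey) pairs also observed with the murine ortholog of the prey."""
    mapped = {
        (bait, mouse_to_human[prey])
        for bait, prey in murine_ppis
        if prey in mouse_to_human  # murine preys without a known ortholog are skipped
    }
    return set(human_ppis) & mapped


def multi_bait_preys(ppis):
    """Return host proteins that interact with more than one pathogen protein."""
    baits_by_prey = {}
    for bait, prey in ppis:
        baits_by_prey.setdefault(prey, set()).add(bait)
    return {prey for prey, baits in baits_by_prey.items() if len(baits) > 1}


# Hypothetical toy data: placeholder names, not interactions from this study.
human_ppis = {("bait_1", "HUMAN_A"), ("bait_1", "HUMAN_B"), ("bait_2", "HUMAN_A")}
murine_ppis = {("bait_1", "mouse_a"), ("bait_2", "mouse_b"), ("bait_2", "mouse_c")}
mouse_to_human = {"mouse_a": "HUMAN_A", "mouse_b": "HUMAN_B"}

conserved = conserved_ppis(human_ppis, murine_ppis, mouse_to_human)
shared_targets = multi_bait_preys(human_ppis)
print(sorted(conserved), sorted(shared_targets))
```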
Selection of Proteins for Animal-model Experiments-To identify the likely virulence factors for validation in animal experiments, we focused on the 19 B. mallei proteins that interacted with both hosts. We considered this to be a reasonable down-selection criterion because murine animal models of B. mallei infection share many pathophysiological characteristics with human infections (62,63). Thirteen of these 19 proteins participated in interactions that were conserved between the two different hosts, and eight of them participated in more than one conserved PPI. Three of these proteins, PilA, BipD, and BimA, have been previously characterized as virulence factors in animal models (64-68). The other five proteins represent targets that have not been previously studied in animal models of glanders or melioidosis, and we considered them as promising proteins for animal model experiments. Because one of these five proteins consists of only 62 amino acids, it was too small for the creation of insertion mutants and was removed from further consideration. Another protein, cytidylate kinase BMA0429, was identified as an essential B. mallei protein [it is an ortholog of STY0980, an essential gene of Salmonella enterica Typhimurium (69)], and, as such, was not likely to be successfully mutated and was not considered further. Thus, we identified BMAA1865, BMAA0728, and BMAA0553 as three relatively uncharacterized and novel B. mallei candidate virulence factors.
Table III (excerpt): BMAA0418 NAD-dependent deacetylase 3 f Yes
Table III footnotes:
a Gene located within the loci of type 6 secretion system (T6SS) clusters (cluster 1 [c1]: BMAA0744-0729, cluster 2 [c2]: BMAA0438-0455, cluster 3 [c3]: BMAA0396-0412, and cluster 4 [c4]: BMAA1897-1915) (13).
b Type 5 secretion system effector (auto-transporter), located adjacent to T6SS c1.
c Predicted phospholipase, a putative type 2 secretion system (T2SS) effector (60).
d Gene located within the locus of the type 3 secretion system (T3SS) (animal pathogen): BMAA1520-1552 (14).
e T6SS-associated effector not located in any known or annotated T6SS cluster.
f T3SS-predicted effector not located in any known or annotated T3SS.
g Predicted as a T3SS effector, but located adjacent to T6SS c1.
h Predicted to be secreted by the general Sec pathway and therefore possibly related to the type 2 secretion system.
i Protein located within the locus of the putative plant-pathogenicity-associated T3SS: BMAA1617-1640 (61).
We initially examined PPIs among these three pathogen proteins and host proteins to identify sets of overrepresented Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways among the host proteins. We found that each of the three proteins interacted with a statistically significant (p ≤ 0.02) number of host proteins that are exclusively part of pathways related to ubiquitin processing, receptor signaling, and protein processing in the endoplasmic reticulum, i.e. they interacted with host proteins that are primarily part of these pathways and do not appear in a large number of other pathways (see Experimental Procedures). In summary, we found that each of these three proteins interacted with host proteins involved in biological processes and/or molecular pathways often implicated in pathogenicity and were, therefore, of interest for experimental validation in a murine model of glanders.
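Overrepresentation of a GO term or KEGG pathway among interacting host proteins is commonly assessed with a hypergeometric tail probability. The sketch below shows that calculation under stated assumptions: the function name, parameter names, and numbers are hypothetical, and the statistical procedure actually applied here is the one given in Experimental Procedures, which may differ (for example, in the background set or in multiple-testing correction).

```python
from math import comb


def hypergeom_sf(k, population, annotated, drawn):
    """P(X >= k) for X ~ Hypergeometric.

    population: number of background host proteins
    annotated:  background proteins carrying the term or pathway annotation
    drawn:      host proteins interacting with the pathogen protein
    k:          interacting proteins that carry the annotation
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Hypothetical numbers for illustration only (not values from this study).
p_value = hypergeom_sf(k=5, population=20000, annotated=300, drawn=60)
print(f"enrichment p-value: {p_value:.3g}")
```

Using exact integer binomial coefficients avoids floating-point underflow for large backgrounds; the only rounding occurs in the final division.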
Mutants Showed Attenuated Virulence in a BALB/c Aerosol Challenge Model-Based on the above evidence, we created separate insertion mutants in the B. mallei ATCC 23344 strain for each of the three genes (BMAA0728, BMAA1865, and BMAA0553). These mutants did not seem to have any ascertainable in vitro growth defect and appeared fully competent. We subsequently used a mouse aerosol challenge model to assess potential attenuation afforded by these B. mallei mutants. The fully virulent wild-type B. mallei ATCC 23344 strain was used as a positive control.
Forty BALB/c mice (10 mice for each of the three mutant strains and 10 mice for the wild-type strain) were exposed to inhalation doses of ≥10⁴ colony-forming units (cfu; corresponding to ≥10 LD50 for the wild-type strain) and monitored for 21 days. Fig. 3 shows the survival curves of mice exposed to mutant strains and mice exposed to the wild-type strain. For this dose, each of the three mutants appeared to attenuate infection, as all 30 mice exposed to the mutants survived 21 days post-exposure, whereas seven mice exposed to the wild-type strain did not survive. A survival-curve analysis indicated that this difference was statistically significant (p = 4 × 10⁻³). However, because not all animals exposed to the wild-type challenge died at the considered dose (≥10⁴ cfu), we repeated the experiment using a higher aerosol challenge dose.
Accordingly, 40 additional BALB/c mice (3 × 10 + 10) were exposed to inhalation doses of ≥2 × 10⁵ cfu (corresponding to ≥20 LD50 for the wild-type strain) and monitored for 21 days. Similarly, we found statistically significant differences (p = 9 × 10⁻⁵) in the survival rate of mice exposed to mutants and mice exposed to the wild-type strain. All three of the B. mallei mutants appeared to attenuate infection, i.e. all mice infected with the mutants survived 21 days post-exposure, whereas all of the mice infected with the B. mallei ATCC 23344 strain died within 7 days post-exposure (Fig. 3). These results support the hypothesis that each of the three mutants has a virulence defect when mice are infected via the inhalational route.
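As a rough cross-check on the day-21 endpoint counts given above, one can apply a one-sided Fisher exact test to survivors versus deaths. This is a deliberately simpler, swapped-in technique and not the survival-curve analysis used in this study: it ignores time to death, pools the three mutant groups (an assumption made only for this sketch), and is therefore not expected to reproduce the reported p-values. The function name is hypothetical.

```python
from math import comb


def fisher_one_sided(dead_a, n_a, dead_b, n_b):
    """P(deaths in group a >= dead_a) with both table margins fixed."""
    total_dead = dead_a + dead_b
    denom = comb(n_a + n_b, total_dead)
    upper = min(n_a, total_dead)
    tail = sum(
        comb(n_a, i) * comb(n_b, total_dead - i)
        for i in range(dead_a, upper + 1)
    )
    return tail / denom


# Endpoint counts taken from the text; group a = wild type, group b = pooled mutants.
p_first = fisher_one_sided(dead_a=7, n_a=10, dead_b=0, n_b=30)    # first experiment
p_second = fisher_one_sided(dead_a=10, n_a=10, dead_b=0, n_b=30)  # second experiment
print(f"first experiment:  p = {p_first:.3g}")
print(f"second experiment: p = {p_second:.3g}")
```

A time-to-event method such as the log-rank test would additionally require the day of death for each animal, which is not listed in the text.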

DISCUSSION
In vitro high-throughput technologies have become versatile tools for discovery and hypothesis generation involving proteins and protein interactions that underlie infectious pathogenesis (70-72). These technologies have been successfully applied to both viral (73-78) and bacterial (21-23, 79-82) pathogens to shed light on how these organisms interact with and subvert host defense processes. For example, Li et al. used a small set of conserved herpesvirus-encoded protein kinases to identify key targeted host proteins using human protein microarray experiments (77). The analysis of these interactions identified the DNA damage response pathway and other key host regulators as central targets in sustaining the viral life cycle.
In Vitro Approaches to Study B. mallei Pathogenesis-A key component of B. mallei pathogenesis is its ability to enter, survive, and replicate within the host cells. Secretion systems play an important role in pathogenicity of the Gram-negative B. mallei by allowing for controlled transport and insertion of pathogen proteins into the host cell cytosol. The identities of secretion systems' proteins are not all cataloged, and the interactions between specific B. mallei proteins and host proteins and other host factors are not well characterized. Given that most Burkholderia spp. cause opportunistic infections, one cannot expect all secretion system proteins to have evolved host-specific protein interactions. Instead, the bulk of the interactions must encompass generic host target interactions to broadly cope with multiple and variable hosts, e.g. plants, nematodes, and other soil microorganisms, as well as opportunistic infections in mammalian hosts. Even for the equine-adapted and, thus, more genetically constrained B. mallei pathogen, we cannot expect that either human or murine interactions will have high specificity. As Y2H techniques are sensitive to weak interactions, this method is ideally suited to probe interactions nonspecific to a distinct host (83). Although many Burkholderia proteins could potentially interact with multiple host proteins, nonsecreted proteins are less likely to encounter host proteins and, hence, focusing on secretion systems' proteins will increase the likelihood of finding biologically important interactions relevant to establishing the infection. Our approach was to use in silico methods to specifically search for virulence factors that are likely to be a part of a secretion system and gauge their potential relevance by identifying important host-pathogen protein interactions based on in vitro data. The in vitro Y2H results thus represent an important source for generating hypotheses on B. mallei virulence and pathogenicity.
FIG. 3. Mouse aerosol challenge model results. In the first set of experiments, 40 BALB/c mice (10 mice for each of the three B. mallei mutants + 10 mice exposed to the wild-type strain) received inhalation doses of ≥2 × 10⁴ colony-forming units (cfu), corresponding to ≥10 LD50 for the wild-type strain. In the second set of experiments, an additional 40 BALB/c mice received inhalation doses of ≥2 × 10⁵ cfu, corresponding to ≥20 LD50 for the wild-type strain. In both experiments, animals were monitored for 21 days. At the end of the first set of experiments, all 30 mice exposed to mutant strains survived 21 days post-exposure (pink, orange, and blue lines), whereas seven mice exposed to the wild-type strain did not survive (green line). At the end of the second set of experiments, all 30 mice exposed to mutant strains also survived 21 days post-exposure (pink, orange, and blue lines), whereas all 10 mice infected with B. mallei ATCC 23344 died (purple line). There was a statistically significant difference in the survival rate of mice exposed to mutant strains and mice exposed to the wild-type strain in both sets of experiments (p = 4 × 10⁻³ and p = 9 × 10⁻⁵ for the first and second set of experiments, respectively). These results support the notion that each of the three mutants is attenuated in virulence when mice are infected via the inhalational route.
Y2H Detection of Host Interactions- Table III shows the number of interactions between the tested B. mallei proteins and human and murine proteins as well as the corresponding number of conserved interactions in terms of orthologous pairs of interactions. The small overlap between murine and human prey proteins for which PPIs were identified is potentially an artifact from the Y2H experiments. A comprehensive detection of all possible host-prey and pathogen-bait protein interactions requires exhaustive repetition (84) and, hence, our detected PPIs represent a subset of all interactions that can occur. These results suggest that more comprehensive screens may be required to exhaustively delineate the total set of possible interactions for the two hosts.
The largest numbers of interactions were detected for proteins in groups 1 and 2, i.e. for proteins supported by multiple lines of evidence, whereas the number of detected interactions decreased with fewer lines of evidence and an increased fraction of hypothetical proteins within a group. Thus, the proteins in group 4, derived from ab initio prediction methods and annotated as enzymes or hypothetical proteins, did not notably interact with host proteins in our Y2H screens. However, a lack of host interactions does not necessarily indicate a lack of a potential biological role, as proteins may function as modulators of metabolites or other host factors without involving detectable protein interactions.
Known Virulence Factors and Their Virulence Phenotypes-Among the known virulence factors shown in Table III, those linked to the T3SS in group 1 interacted with at least one host protein. Some of these proteins, e.g. BipD, BipB, BopA, and BopE, have already been shown to be required for different aspects of pathogenicity in different animal models. For example, the BipD mutant of B. pseudomallei has been shown to attenuate infection after intraperitoneal challenge of BALB/c mice (67), whereas the BipB mutant of B. pseudomallei was not able to produce multinucleated giant cells during intranasal challenge of the same host and, consequently, improved host survival time (19). These two proteins, along with BipC, are positioned at the tip of the T3SS needle and form translocation pores (56). Furthermore, studies have shown that the effector (secreted protein) BopA is required for intracellular survival of both B. pseudomallei and B. mallei in phagocytic cells (56) and the B. pseudomallei effector BopE is required for optimal invasion of nonphagocytic cells. However, because BopE deletion mutants showed hardly any attenuation during intraperitoneal or intranasal infection of B. pseudomallei in BALB/c mice (56,67), these results indicate that multiple effectors, possibly in concert with BopE, contribute to a robust infection.
Group 1 also contained a number of proteins from T6SS, the majority of which were annotated as proteins containing a VgrG domain. A number of large VgrGs (complex trimeric proteins containing transmembrane parts and a C-terminal domain) have been identified in Gram-negative bacteria and shown to be secreted and interact with host cells (85). The largest VgrG protein in B. mallei, BMAA0737, is an ortholog of the B. pseudomallei VgrG protein, BPSS1503, a protein identified as essential for virulence (11). Although we were unable to successfully clone BMAA0737, we identified a handful of PPIs for three of the five VgrG protein constructs we could assay. Two of them, BMAA0445 and BMAA0446, belonging to the second T6SS cluster, were recently identified in B. pseudomallei as important for interactions with other bacteria (86), whereas the third one, BMAA1269, was located outside of any known T6SS clusters (13). The small number of identified PPIs for VgrG proteins is likely a consequence of difficulties in expressing large proteins with transmembrane domains (87). Preliminary data also suggest that recombinant expression of VgrG proteins in Escherichia coli is challenging (K. Kwon, unpublished data).
In addition to T3SS and T6SS proteins, group 1 also contained BimA, a well-characterized auto-transporter of B. pseudomallei and B. mallei that is exported through a type 5 secretion system (12). The role of BimA in actin-based motility is accepted, but the mechanisms are not fully understood, and it is likely that they differ between B. pseudomallei and B. mallei (12). In B. mallei, BimA was not required for pathogenicity in a hamster model of glanders (13), whereas in B. pseudomallei it was found to contribute to pathogenicity in murine models (12).
Thus, although many of the known virulence factors in group 1 did include a large number of interactions, their specific biological function and resultant infectious phenotype varied depending on both the animal model and the pathogen species used. Given that mutation of each of the three B. mallei proteins, BMAA0728, BMAA1865, and BMAA0553, similarly attenuated virulence in our animal model and, hence, generated a common nonvirulent infectious phenotype, we now turn to our Y2H interaction data to infer and discuss the putative role of these novel virulence factors in invasion and survival within the host cell.
Novel Virulence Factors and Their Host Protein Interactions-Focusing on the three novel B. mallei virulence factors and their interacting human and murine proteins (Table III), these proteins participated in 192 and 236 PPIs with human and murine proteins, respectively (supplementary Data set S3 and supplementary Data set S4), containing 170 unique human proteins and 191 unique murine proteins. Fig. 4 graphically shows these protein interactions for both hosts. There are 12 conserved (or overlapping) PPIs between these two data sets, containing 10 unique host proteins. To maximize the coverage of interactions, we combined all human- and murine-B. mallei PPIs detected in our screens in subsequent analysis. This resulted in a data set consisting of 382 interactions among three B. mallei proteins and 336 human proteins (170 human proteins + 166 human-murine orthologs).
Pathogen Targeting Ubiquitin-related Processes-We found that all three proteins interacted with a statistically significant (p ≤ 2 × 10⁻³) number of host proteins involved in ubiquitin-related processes (supplemental Table S5). This finding is noteworthy, as intracellular pathogens are known to interfere with host ubiquitination processes, thus disturbing immune signaling cascades and attenuating the immune response (88-91). Additionally, pathogens interfere with the host ubiquitin proteasome system to prevent their degradation as well as to ensure their destruction when no longer required for establishing the infection (88-91).
One of the major known B. mallei virulence factors in the modulation of the host ubiquitination system is TssM (BMAA0729), a broadly acting deubiquitinase (17,92). Although the exact function of TssM is unknown, it has been suggested that it acts to remove ubiquitin from pathogen proteins, preventing their recognition and, consequently, the activation of early inflammatory responses (64). Using the PFAM-A database (43), we mapped the ubiquitin C-terminal hydrolase (UCH) domain to TssM. Interestingly, we found that the identified BMAA0728 protein (TssN) interacted with human and murine proteins that contain the UCH domain. This suggests that BMAA0728 may also interact with the UCH domain of its chromosome-adjacent protein TssM, a protein with which it is also coexpressed (D. DeShazer, unpublished data). Even though we tried to identify PPIs for TssM (Table III), we were unable to obtain a significant number of interactions for this protein, despite the large number of interactions detected for TssN. We further found that BMAA0728 interacts with the polyubiquitin-B protein (UBB) and with the cullin-1a protein (CUL1), a core component of multiple cullin-RING-based SCF E3 ubiquitin-protein ligase complexes. From the available human PPI data (42), we found that UBB and CUL1 physically interact with TNF receptor-associated factor 6 (TRAF-6) and I-kappa-B inhibitor alpha (IκB-α), two host proteins that have been shown to be targets of TssM (17,92). These observations lead to the hypothesis that BMAA0728 interferes with host ubiquitination processes either directly or in collaboration with TssM. However, based on currently available domain annotations, no known domain could be assigned to BMAA0728.
Pathogen Targeting Host Entry Mechanisms-We identified Y2H interactions between BMAA0728 and GABA(A) receptor-associated protein-like 1 (GABARAPL1), a host protein possibly involved in the closure of phagosomal membranes (93,94). This observation implies a possible role of BMAA0728 in B. mallei (auto)phagosomal escape. Besides GABARAPL1, we found many other murine and human host proteins that are involved in processes that can be related to cellular internalization or degradation escape and evasion and that interact with all three proteins (BMAA0728, BMAA1865, and BMAA0553; supplemental Table S5). However, based on the available data, we cannot determine a precise role for these three proteins in (auto)phagosomal escape/evasion processes.
To gain entry into host cells, multiple pathogen proteins target the host cytoskeleton through direct interactions with actin cytoskeleton regulators (95,96). One such protein, Rho family GTPase (guanosine triphosphatase) CDC42, is a known pathogen target that induces actin rearrangement in the host cell (64, 96-98). Previous studies have suggested that B. pseudomallei BopE (B. mallei orthologous protein BopE) acts as a guanine nucleotide exchange factor for the Rho family of GTPases and that, through an interaction with a Rho GTPase CDC42, B. pseudomallei facilitates membrane ruffling and uptake into the host cell (64,99). Studies have also implied that multiple pathogen proteins may act together with BopE to facilitate bacterial invasion (64,99). Our study identified that BMAA1865 also interacts with Rho GTPases, such as CDC42 and RhoH (Ras homolog gene family, member H), as well as with a number of other proteins involved in processes related to actin cytoskeleton rearrangement (supplemental Table S5), and we hypothesize that BMAA1865 represents another B. mallei protein involved in host membrane ruffling and bacterial invasion.
Pathogen Targeting Phosphorylation Signaling-Although the B. mallei serine/threonine phosphatase BMAA0553 did not interact with any host protein already known to be involved in bacterial invasion and survival, we found it interacted with a large number of host proteins involved in various signaling processes and, hence, it is likely that it interferes with host intracellular signaling pathways (supplemental Table S5). We found that BMAA0553 interacts with Rho GDP (Rho family of guanosine diphosphate) dissociation inhibitor-β (ARHGDIB), which regulates the GDP/GTP exchange reaction of Rho proteins, as well as with CDC42. Furthermore, as serine/threonine phosphorylation has been linked to actin cytoskeleton rearrangement processes (97,100), it is possible that BMAA0553 also has a role in cytoskeletal rearrangement.
Understanding Host-Pathogen Interactions-Given the inherent nonspecific protein interactions and the lack of exhaustive experimental testing of protein interactions, our approach was to delineate the possible effects of select B. mallei proteins by investigating which host processes they would have a high likelihood to interfere with. Indeed, the new virulence factors we identified appear to have a multifunctional biological role, perhaps because of nonspecific interactions.
The nonspecific nature of the interactions is both a source of weakness and strength in the ability of the pathogen to establish an infection. Avoiding highly specific and strong interactions avoids the risk of being limited to infecting only those hosts that have the requisite specific proteins. However, it requires that the pathogen maintain many, possibly alternative, indiscriminate interactions to achieve the same goal. The apparent multivariate capability of the virulence factors is also commensurate with the notion of using multiple, possibly overlapping, interactions to influence the host environment and to maximize the likelihood of establishing successful infections in multiple hosts and host environments.

CONCLUSIONS
Our study is the first to specifically and systematically generate host-protein interactions associated with known and putative virulence factors of B. mallei. We used bioinformatics approaches to identify 49 proteins potentially involved in B. mallei pathogenicity. Using Y2H assays to screen these proteins against normalized whole human and murine proteome libraries, we identified a large number of host-pathogen PPIs. Analyzing these interactions allowed us to characterize host proteins targeted by B. mallei proteins and to identify three novel virulence factor candidates. Using a mouse aerosol challenge model, we demonstrated that mutation of each of these three proteins attenuated B. mallei ATCC 23344 virulence, implicating them and their host targets in B. mallei pathogenicity.