The genetic basis of variation immune defense against Lysinibacillus fusiformis infection in Drosophila melanogaster

The genetic causes of phenotypic variation often differ depending on the population examined, particularly if the populations were founded by relatively small numbers of genotypes. Similarly, the genetic causes of phenotypic variation among similar traits (resistance to different xenobiotic compounds or pathogens) may also be completely different or only partially overlapping. Differences in genetic causes for variation in the same trait among populations suggests considerable context dependence for how selection might act on those traits. Similarities in the genetic causes of variation for different traits, on the other hand, suggests pleiotropy which also would influence how natural selection would shape variation in a trait. We characterized immune defense against a natural Drosophila pathogen, the Gram-positive Lysinibacillus fusiformis, in three different populations and found almost no overlap in the genetic architecture of variation in survival post infection. However, when comparing our results to a similar experiment with the fungal pathogen, B. bassiana, we found a convincing shared QTL peak for both pathogens. This peak contains the Bomanin cluster of Drosophila immune effectors. RNAi knockdown experiments confirms a role of some of these genes in immune defense against both pathogens. This suggests that natural selection may act on the entire cluster of Bomanin genes (and the linked region under the QTL) or specific peptides for specific pathogens. Author Summary Like most traits, the way individuals respond to infection vary among individuals within a population. Some of this variation is caused by genetic differences in the host organism. Over the past decade, two prominent resources were developed to assess genetic variation for complex traits of the fruit fly, Drosophila melanogaster and map the genetic variants responsible. We recently described a strain of Lysinibacillus fusiformis bacteria, which was isolated from fruit flies and is moderately virulent when flies are infected. We mapped genetic variation in resistance L. fusiformis using these mapping resources. We find that among the resources, different changes were associated with immune defense. However, we also found that within a resource, the same region of the genome was associated with resistance to both L. fusiformis and a fungal pathogen. These results suggest that different populations adapt differently to the same pathogens, but two distinct pathogens share similar causes of genetic variation within a single population.

based on other studies [46][47][48] and are mostly antimicrobial peptides in the Toll and Imd pathways.
We used an adjusted P-value threshold of 0.1 for Gene Ontology enrichment analysis, which yielded 36 genes differentially expressed. Gene Ontology categories significantly enriched included several related to immune defense: response to bacteria, response to external biotic stress, defense response, immune response, defense response to Gram-positive bacterium, defense response to Gram-negative bacterium (see the entire list in Supplemental Table 1).
It appears that the induced antimicrobial peptides are more squarely induced by the Imd pathway instead of the Toll pathway, suggesting induction of gene expression is more similar to Gram-negative bacteria than Gram-positive bacteria. The Lysinibacillus genus are Gram-positive bacteria, but like others in the Bacillales order, the peptidoglycan is not l-lysine-(lys) type. Instead, most Lysinibacillus examined have the A4alpha (Lys-Asp) peptidoglycan [49,50]. Since the humoral response in D. melanogaster is driven in part by recognition of peptidoglycan type, the expected relative responses of the Toll and Imd responses to Bacillales is less clear cut than for most bacteria that fit more neatly into Gram+ and Gram-categories based on peptidoglycan [14]. We calculated Pearson correlation coefficients for Log 2 Fold Change (infected vs. uninfected) between our L. fusiformis data and 12-hour post infection gene expression for 11 bacterial infection treatments from the Troha et al. [46] data. Although all correlations were significant (P<0.0001), there was stronger correlation between L. fusiformis and Gram-negative bacteria (median = 0.4) than any of the Grampositive bacteria (median = 0.32, Figure 1C). The same pattern is true when we only considered significantly differentially expressed genes or only significantly induced genes (data not shown). Thus, L. fusiformis is a moderately virulent bacterial pathogen that induces a strong innate immune response that is more similar to the transcriptional response to Gram-negative than Gram-positive bacteria.

Genetic variation for immune defense against L. fusiformis in the DGRP
All three populations (DGRP and two DSPR populations) showed considerable genetic variation for survival after infection with L. fusiformis (Figures 2A, 3A and 4A). Raw survival in the DGRP population ranged from 10% to 100% surviving 5 days post infection (DPI). We also examined survival 2 days post infection, which was much higher overall and showed less variation (Supplemental Figure 1). Like several other studies of immune defense in the DGRP, we found that line 321 was particularly susceptible to infection, so we calculated line effects with and without line 321 (Supplemental Table 2 Tables 3-6). Sixteen SNPs were significantly associated with survival 5 DPI when all lines were considered. Eleven intronic SNPs were in eys/CG9967, Pvr, CG44153, Trim9, CG15611 (4 SNPs), trio, CG32698 (actually a 2bp deletion), and Frq1. One SNP was in the exon of the noncoding RNA CR44310, one SNP caused a silent change in CG14971, and 3 SNPs were intergenic. When we omitted line 321, only 3 SNPs were significantly associated with 5-day survival - all three were intronic SNPs in Pvr. Note that the SNPs significantly associated when all lines were included but not significantly associated when 321 was omitted were mostly (13 out of 14) still tested, but had low enough minor allele frequencies that power was lost and P-values were above our significance threshold. Only one fell below our minor allele frequency threshold and so was not tested. Eight SNPs were significantly associated with 2-day survival when all lines were considered. This included 4 intronic SNPs in Zasp52, trio, CG32364, and Plod as well as 4 SNPs in the 3' UTR of CG15564. There were no significantly associated We investigated genes connected to significantly associated SNPs in a few ways. First, we examined whether any of these genes were differentially expressed 12 hours after infection based on our RNA-seq data.
None were significantly differentially expressed even if we consider the nominal P-values, so genes connected to significantly associated SNPs are not significantly induced or repressed upon infection (at least at the 12hour point).
Next, we examined annotated functions of these genes. Two stand out. First, Pvr (PDGF-and VEGFreceptor related) is involved in the regulation of several innate immune functions including apoptosis, humoral immune response, JNK signaling and wound healing [51][52][53][54][55]. Pvr is a negative regulator of humoral immune response, so if these SNPs are associated with alternative splicing or otherwise influence Pvr expression, that would be a plausible connection. The other gene of interest is CG44153, which is predicted to enable cell-cell adhesion mediator activity and is active in the plasma membrane [56]. It stands out because it was also identified in two other DGRP immunity association studies: survival time after infection with Pseudomonas aeruginosa [21], and microbiota-related diet shift [57]. However, CG44153 also was reported to harbor genetic variation with several other non-immune phenotypes, so it is possible that there is something about genetic variation at the locus that makes it prone to false positive associations.

Little evidence for correlations among immune phenotypes in the DGRP
The DGRP lines have been available to the public for more than a decade, and in that time, there have been several studies attempting to map variation in immune defense against a range of pathogens. This allows us to begin to determine whether immune defense against different pathogens is correlated. Are lines that are better at fighting infection against one pathogen, generally better at fighting infection? If so, is this true universally or within categories such as Gram-positive bacteria? A general correlation in immune defense would suggest that at least some of the genetic basis of that variation is also shared for these different pathogens. There is some evidence for this in the overlap in associations in this study and those of Chapman et al. and Wang et al. [1,21]. However, it is also possible that resistance is much more specific for each pathogen. This is likely given some of the specificity of associations -particularly for Providencia rettgeri and Diptericin [5,20,37].
We calculated Spearman's rank correlations for 14 DGRP immune-related data sets plus an additional 6 unrelated phenotypes ( Figure 2C). Though there are several positive correlations, few are significantly correlated after Benjamini & Hochberg correction. Correlations tend to be between the same microbe using different experimental design (different sex or survival vs. bacterial load) or between different microbes measured in the same laboratory (E. faecalis and L. fusiformis were both measured in our laboratory at the same basic time; M. anisopliae and P. aeruginosa were reported as part of the same experiment in Wang et al. [21]). However, there is an almost complete lack of significant correlation between immune phenotypes and any of the other phenotypes, the exception being a positive correlation between female survival after P. aeruginosa and male starvation survival. Given the general lack of correlation between phenotypes, we decided not to look at the overlap in significantly associated SNPs between immune phenotypes. Our correlation results suggest that the DGRP immune phenotypes that have been reported do show correlation, but it is difficult with the available data to determine if these correlations are due to phylogenetic relatedness of the microbes used or are an artifact of something about the experimental design (how phenotypes were defined, dose, etc.) or labs where the experiments were performed.

Genetic variation for immune defense against L. fusiformis in the DSPR
The DSPR consists of two populations (A and B) that were derived from eight different founder genotypes. Each of those populations was further split into two parallel sets (e.g. A1 and A2) which were derived from the same founders, but different interbreeding populations. We infected 343 lines from population iterations. This yielded several QTL peaks with LOD scores above the threshold, but they concentrate in two regions ( Figure 3B). The first peak (qtl A5.1) was centered at Chr2R:3,420,000 (LOD drop 2 range 3,360,000 to 3,470,000, note this is release 5 of the D. melanogaster genome, LOD = 8.59) and was associated with 10.96% of the variation in the phenotype. We extracted the effects of individual founder genotypes (Supplemental Figure 7A) and found that founder A1 genotypes at this region had significantly higher survival than the other four genotypes represented. This includes all or part of 21 genes, none of which have obvious immune function, nor were any differentially expressed in our RNA-seq data after p-value correction. One gene, Cyt-b5 showed an almost 50% reduction in expression upon infection, which was associated with a nominal p-value of 0.02. Importantly it is annotated to be involved in metabolism of xenobiotic compound, but the naïve expectation if it is breaking down microbial toxins would be induction, not suppression upon infection. The second peak (qtl A5.2) had a higher LOD score and was also much broader. We focused on the main peak (highest LOD score) which was centered at Chr2R:14,720,000 (LOD drop range from 14,680,000 to 14,770,000, LOD = 21.19) and was associated with 24.89% of the variation. This peak contains all or part of 19 genes including two genes involved in immune defense: DptA and DptB. The founder genotype from A5 is associated with significantly lower survival than the other represented genotypes (Supplemental Figure 7B). If we expand to look at genes under the three adjacent (actually overlapping) peaks with LOD scores above 20, we also include the gene, Jabba, which also functions in immune defense. One more QTL peak still overlapping these but with a LOD score of 14.7 (peak 20) contains the Bomanin gene cluster (eight small, secreted peptides involved in immune defense) as well as Imd (receptor for the Imd pathway). Line A5 harbors two unique nonsynonymous SNPs in Jabba exon 6 as well as a loss of stop codon in BomT2 (a Bomanin). The  Table   S9). Heritability was 0.674 for survival 5 DPI and 0.760 for survival 2 DPI. The QTL scan did not result in any significant peaks with alpha = 0.05 ( Figure 4B). In fact, we had to relax alpha to 0.20 to get significant peaks.  Figure 8C). Scanning through variation in these genes, we did not find any sites limited to B5 and B7 that were immediately good candidates for being causal.  Table S9 and S10). In fact, there was an additional peak at Day 2 (qtl B2.1, Supplemental Figure 9). This peak is only significant with alpha = 0.1, and is on 3L at 2,880,000 (LOD score = 7.57, 2 LOD drop range: 2,850,000 to 3,100,000, 8.53% of variance explained). Founder line B3 genotypes had a significantly lower survival than others (Supplemental Figure 7D). The peak includes 27 genes including spz-5 (a paralog of the canonical Toll pathway Spaetzle gene). These genes were not differentially expressed in the transcriptome data.  Table   S11). In contrast to what Shahrestani et al. [43] reported, we did find one peak significantly  C. 13.15% of variance explained). The A5 founder genotype had significantly lower survival than other lines ( Figure 5C).
Thus, the B. bassiana peak is quite similar to the location of qtl A5.2 for L. fusiformis, even though the two experiments were performed in different subsets of the A population ( Figure 5B). To test whether this overlap is more than expected by chance, we performed a Fisher's Exact Test with the fraction of physical positions with LOD score greater than the threshold score (alpha = 0.05). Day 5 survival for L. fusiformis had 289 of 11768 significant, while B. bassiana had a single position significant of 11768. That one position significant for B. bassiana was also significant for L. fusiformis, which is a significant excess over the expectation (expected overlap = 0.0246, Fisher Exact Test P-value < 0.0001). Furthermore, the founder genotype effects for both phenotypes clearly show reduced survival for the A5 genotype compared to all other genotypes present (this is true whether we match the peaks as in Figure 5C

Discussion
Populations face infection challenge from innumerable different pathogens. The genetic factors that influence susceptibility to infection may vary from population to population for a specific pathogen, and from pathogen to pathogen within a specific host population. We examined genetic variation for susceptibility to the Gram-positive bacterium, L. fusiformis infection in 3 Drosophila mapping populations and find no evidence that the same genetic factors contribute to phenotypic variation in susceptibility across the three populations. We also compared the genetic causes of phenotypic variation for susceptibility to L. fusiformis and the fungal pathogen, B. bassiana [43]. Surprisingly, even though these experiments were conducted by different researchers in different labs with different subpopulations, there was striking concordance of a single quantitative trait locus (QTL) for both pathogens. Although that QTL peak was broad, it contains a cluster of immune peptides, some of which are induced upon infection, and show increased susceptibility to both infections upon RNAi knockdown.
We hoped to determine the extent to which different genotypes are correlated in how they defend against infection. Are some genotypes universally more resistant to infection? Is correlation stronger when pathogens are more closely related? Do correlations suggest tradeoffs among immune phenotypes (pathogen X vs. pathogen Y) or between immune phenotypes and other life history traits? The DGRP lines are widely used for measuring phenotypic variation and attempting to map the genetic causes of that variation. As such, there are several studies on both immune and life history traits to examine genotypic correlations. Although we did find evidence of correlation, it was difficult to tease apart whether these relationships were truly biological or were artifacts -stronger correlations tended to be between phenotypes measured in the same lab or institution (different pathogens in the same lab, the same phenotype for each sex). In fact, there were no significant negative correlations, which would be expected with strong tradeoffs. The only exception to the general lack of signal (positive or negative) was the positive correlation between male starvation stress resistance and female P. aeruginosa survival. Although this might suggest that some genotypes are better at responding to general stress (starvation or infection) than others, it is curious that the correlation is only significant for one infection type and not even in males infected with P. aeruginosa. We therefore hesitate to draw conclusions from this correlation.
In contrast to the DGRP experiments, we find a striking shared QTL peak comparing infection with both L. fusiformis and B. bassiana in the DSPR lines. We are unable to measure the correlation between genotypes since the two experiments were performed in two different DSPR subpopulations. Note that both subpopulations were initiated from the same set of 8 founder genotypes, so the genetic variation present in the two subpopulations is similar. The shared peak for two very different pathogens could mean several things.
First, there are many genes under that peak and even many paralogs of the genes encoding Bomanin peptides. Therefore, it is possible that the genetic basis of increased susceptibility to L. fusiformis in the A5 background is different from the genetic basis of increased susceptibility to B. bassiana in the A5 background.
If the phenotypic variation for the two pathogens was caused by different, but unique genetic changes, we imagine that natural selection might act on these linked variants as a single unit. Second, there may be a shared genetic basis for resistance to infection. Perhaps a single Bomanin (or small subset of Bomanins) are responsible for most of the response to infection, but in a nonspecific way. At this point, we know very little about how specific Bomanins act in the immune system (Table 1) Our approach enabled us to also determine whether the same genetic factors influence immune defense in different populations. We acknowledge that none of the three populations are ideal populations: DGRP are inbred lines that survived the lab environment after capture at a farmer's market, DSPR are derived from several lines from around the world and made into a "synthetic" population. Nonetheless, there was no evidence that any of the three populations shared genetic variation in any particular region that influenced survival after L. fusiformis infection. This may be because pathogen pressure varied in the populations from which these lines were derived. It may also be that immune alleles tend to turn over rapidly as populations coevolve with pathogens meaning that the alleles best suited to fight infection in one population are missing from another population. Finally, the small founder populations in the DSPR may preclude anything but high frequency variants from being detected in multiple experiments.
Several recent studies have attempted to characterize the role of individual Bomanin genes, the entire cluster of 10 genes, or subsets of those genes (summarized in Table 1) [1-3, 26, 46]. This began with Clemmons et al. [26] who characterized the gene family and showed that deletion of the entire cluster rendered flies highly susceptible to infection with E. faecalis, C. glabrata and F. oxysporum, but not the Gram-negative E. cloacae. The authors also showed the first evidence that there might be some specificity of activity since deleting the six left paralogs of the Bomanin cluster resulted in high susceptibility to E. faecalis and F. oxysporum, but not C. glabrata. To investigate further, Lindsay et al. [2] infected flies with four Bomanin transgenes on a Bomanin cluster partial deletion with C. glabrata. Two transgenes (BomS3 and BomS5) restored survival to wildtype levels, one transgene (BomS6) showed intermediate restoration of survival, and one (BomT2) showed no restoration of wildtype survival. As expected, the expression of these genes after infection correlates with their ability to restore wildtype survival. When expressed with the BomS3 promoter, BomS6 was not susceptible to C. glabrata infection, but BomT2 showed no difference in susceptibility. Xu et al. [3] showed that BomBc1, BomS3 and BomS4 were significantly induced upon A. fumigatus infection and that, as a whole, the Bomanin cluster enhances both resistance to, and tolerance of, infection with A.
fumigatus. BomBc1, BomS3 and BomS6 increased tolerance to the fungal toxin, restrictocin, whereas Bom S6 and BomS1 increased tolerance to another toxin, verruculogen. Here, we showed that natural variation in immune defense against both the Gram-positive L. fusiformis and the fungal B. bassiana map to a QTL that contains the Bomanin cluster and that knockdown of specific Bomanin copies result in enhanced susceptibility (note that we do not differentiate between tolerance and resistance). Unfortunately, most of these studies utilize a subset of the Bomanins and use different techniques (CRISPR/Cas9 knockouts, transgenic lines, RNAi), so a systematic understanding of the specificity of individual Bomanins or the subclasses (tailed, short and bicipital) remains elusive.
This study highlights the context-dependence of the causes of genetic variation in immune defense.
Within the same species, different populations harbor different genetic variants that result in variation in immune defense. Surprisingly, however, within a population, two very different pathogens share a QTL for susceptibility to infection, with a single founder genotype being significantly more susceptible than others.
Although our approach does not have the resolution to determine whether the causal genetic variants are the same or just genetically linked, this linkage allows selection to act more broadly to reduce susceptibility when susceptibility to such disparate pathogens is governed by a single locus.

Bomanin
Old  Table 1. Summary of Bomanin gene immune phenotypes. *Flysick induction indicates which of the ten microbes resulted in at least 10-fold higher expression compared to uninfected individuals. "G+" means all Gram-positive bacteria (M. luteus -Ml, E. faecalis, S. aureus) induced at least 10fold. Two Gram-negative bacteria out of 7 also led to a 10-fold increase in expression: Pseudomonas entomophila (Pe) and Escherichia coli (Ec).  IN). Canton S is a standard laboratory Drosophila line. All Drosophila stocks were maintained on standard molasses media at 23 degrees Celsius on a cycle of 12 hours light to 12 hours dark.

Drosophila strains and husbandry
We refer to both sets of lines by their numeric assignment (for the DGRP we refer to line 321, not RAL_321 or line_321).

Microbes and microbiology
Lysinibacillus fusiformis (strain Juneja) is a natural pathogen of Drosophila obtained from Brian Lazzaro at Cornell University [45]. Bacteria were grown from single colonies in LB liquid media overnight to stationary phase, then concentrated to an optical density (600nm) of about 4.0. Beavaria bassiana (strain GHA) was obtained from Parvin Shahrestani (California State University at Fullerton). Spores were grown in malt extract agar, then resuspended with a 10ml syringe, filtered through a 0.22 m Miracell filter and resuspended in 20% glycerol at 4.3e08 spores per mL as measured on a hemocytometer.

General Infection protocol
Flies were tipped to new media and cleared after 2-4 days of egg laying to control larval density. Ten to fourteen days later, emerging adults were moved to new media vials. Three-to seven-day old males were

Infection dynamics in Canton S
The effect of L. fusiformis infection on Canton S survival was assessed using a Cox Propotional Hazard test implemented in the R packages survival and survminer [59,60] with treatment (infected vs. sterile prick) as the independent variable.

RNA-sequencing of infection in Canton-S males
We These values were used with the DGRP website [30] to perform association mapping. Thus, there were four total mapping experiments for the DGRP: 2-and 5-days post infection and with and without line 321. We calculated heritabilities using line variance and total variance from the VarrCorr function in lme4.
We next asked whether line effects correlated between DGRP infection experiments (and a few other traits). We retrieved line effects for both male and female 24 hour bacterial load and female survival post Providencia rettgeri infection [5,37], male and female Coxiella burnetii hazard ratio [67], female Pseudomonas entomophila enteric survival [68], male and female Pseudomonas aeruginosa survival [21], male and female Metarhizium anisopliae survival [21], Staphylococcus aureus phagocyte mobilization [69], Listeria monocytogenes bacterial load [70], Enterococcus faecalis survival [1], male aggression [71], male starvation resistance [30], pigmentation (tergite 5) [72], lifespan and fecundity [73], and weight [34]. We calculated Spearman's rank correlations for these 20 phenotypes and grouped by non-immune, Gram-positive/fungal and Gram-negative pathogens. In these analyses, we used the negative of the load values which means that all immune phenotypes have higher values indicating better immune defense and lower values meaning worse immune defense.

Genetic variation for immune defense in the DSPR
The DSPR experiment was similar to that for the DGRP, but with more lines and fewer individuals We also retrieved raw data for survival after B. bassiana infection [43] and performed a logistic regression analysis with random effects of line, replicate nested in line, group and group nested in set (as in their analysis). This differs from the authors' approach who used arcsin-square root transformed survival. All other analyses were performed as above.
RNAi knockdown experiments to examine the role of individual genes in immune defense.
We followed up our QTL mapping by examining several genes that occur under the main peak on chromosome 2R. We obtained the TRIP RNAi lines [74] from the Bloomington Drosophila Stock Center (Bloomington, Indiana, USA) and crossed to the C564 (RRID: BDSC_6982) fat body-specific driver [75] . We also crossed the driver to an empty vector and RFP RNAi (when available) as negative controls. Infections were performed as described above for L. fusiformis, but B. bassiana were monitored for 21 days instead of 7 days, with flies transferred to new food every 3-4 days. In addition, we mock infected lines with either Luria-Bertani broth or 20% glycerol as a sterile wound control. Very few of these flies died from the wound. We assessed significant differences in survival using the coxph function from the survival package in R [59,76] with genotype and date as variables and the empty vector as the reference. Since the RNAi lines are on different backgrounds (AttP2 and AttP40), we performed separate analyses based on background.