Identification of new loci involved in the host susceptibility to Salmonella Typhimurium in collaborative cross mice

Background: Salmonella is a Gram-negative bacterium causing a wide range of clinical syndromes, from typhoid fever to diarrheic disease. Non-typhoidal Salmonella (NTS) serovars infect humans and animals, causing a significant health burden worldwide. Susceptibility to salmonellosis varies between individuals under the control of host genes, as demonstrated by the identification of over 20 genetic loci in various mouse crosses. We have investigated the host response to S. Typhimurium infection in 35 Collaborative Cross (CC) strains, a genetic population which involves wild-derived strains that had not been previously assessed.
Results: One hundred and forty-eight mice from 35 CC strains were challenged intravenously with 1000 colony-forming units (CFUs) of S. Typhimurium. Bacterial load was measured in spleen and liver at day 4 post-infection. CC strains differed significantly (P < 0.0001) in spleen and liver bacterial loads, while sex and age had no effect. Two significant quantitative trait loci (QTLs) on chromosomes 8 and 10 and one suggestive QTL on chromosome 1 were found for spleen bacterial load, while two suggestive QTLs on chromosomes 6 and 17 were found for liver bacterial load. These QTLs are caused by distinct allelic patterns, principally involving alleles originating from the wild-derived founders. Using sequence variations between the eight CC founder strains combined with database mining for expression in target organs and known immune phenotypes, we were able to refine the QTL intervals and establish a list of the most promising candidate genes. Furthermore, we identified one strain, CC042/GeniUnc (CC042), as highly susceptible to S. Typhimurium infection.
Conclusions: By exploring a broader genetic variation, the Collaborative Cross population has revealed novel loci of resistance to Salmonella Typhimurium. It also led to the identification of CC042 as an extremely susceptible strain.
Electronic supplementary material The online version of this article (10.1186/s12864-018-4667-0) contains supplementary material, which is available to authorized users.

Background
Salmonella is a Gram-negative bacterium responsible for typhoid fever and diarrheic disease. It is one of the leading causes of food-borne infections and remains a major threat to human populations [1][2][3]. Non-typhoidal Salmonella (NTS) serovars, especially Salmonella enterica serovar Typhimurium, infect both humans and animals and cause a significant disease burden, with an estimated 93.8 million human cases and 155,000 deaths worldwide each year [4]. The variable outcome of Salmonella infections depends on many parameters, including the bacterial strain, environmental factors and host genetic makeup [5]. The identification of host genetic variants associated with increased resistance to the infection reveals critical mechanisms in the complex interplay between the bacteria and their host, and is instrumental to the development of effective therapies. Infection of mice with Salmonella Typhimurium is widely used as an experimental model of human typhoid fever [6]. In mice infected either orally or intravenously, the bacterium rapidly localizes and replicates in the spleen and the liver, with no clinical signs of gastroenteritis before systemic infection [7]. Laboratory mouse strains display a wide range of susceptibilities [8,9]. The C57BL/6J (B6) strain is extremely susceptible, with high spleen and liver bacterial loads and death around day 5-6 post-infection, while most 129S substrains are highly resistant, with low spleen and liver bacterial loads, and survive [10,11]. Significant advances in understanding the host response to Salmonella infection have been made over the years with the identification of genes such as Slc11a1 (Nramp1), Tlr4 and Btk, first in the mouse model [12][13][14][15] and later in other animal species [16,17]. SLC11A1 is expressed in the membrane of macrophages and neutrophils, and controls the replication of the bacteria by altering the maturation of the Salmonella-containing vacuole (SCV).
The B6 inbred strain is susceptible to Salmonella Typhimurium infection due to a single Gly169Asp (G169D) mutation in the predicted TM4 domain of the SLC11A1 protein, resulting in the absence of mature protein in the membrane compartment. TLR4 is known for its fundamental role in recognizing lipopolysaccharide (LPS) from bacterial outer membranes and in activating innate immunity. The BTK tyrosine-protein kinase plays a critical role in the regulation of B cell receptor signaling [5,18].
Strategies using backcrosses or intercrosses between mouse strains susceptible and resistant to S. Typhimurium have been used to map various quantitative trait loci (QTLs) involved in these susceptibility differences [9,10,19]. QTL confidence intervals identified in such crosses are usually broad, and identifying the causative gene(s) can be very challenging [20]. To overcome this problem, a large panel of new inbred mouse strains, namely the Collaborative Cross (CC), was developed over the last decade through a global community effort [21]. The CC strains are recombinant inbred strains derived from eight distinct founder strains that include five classical laboratory strains combined with three wild-derived strains [22,23]. CC strains represent a genetically heterogeneous population with an even distribution of allelic variation, and a distribution of allele frequencies which closely resembles that found in human populations [24]. The CC accounts for almost 90% of the known genetic variation present in laboratory mice originating from M. musculus, with more than 35 million SNPs segregating between the CC founders [24]. Almost all CC strains have approximately equal contributions from each founder, their genomes contain more recombination events, and over 90% of loci are homozygous with known genotypes [24][25][26][27].
In this study, we used the CC mouse population to identify new loci involved in the complex host response to Salmonella Typhimurium infection. We challenged 148 mice from 35 CC strains with 1000 CFU of Salmonella Typhimurium and identified two significant and one suggestive QTLs associated with spleen bacterial load, and two suggestive QTLs associated with liver bacterial load. We found that wild-derived alleles contributed largely to the effects of these QTLs. Using sequence variations of the CC founder strains combined with gene expression analysis, we identified promising candidate genes within each QTL interval. We also identified CC042/GeniUnc (CC042) as an unusually susceptible CC strain that may have further use as a model for studying Salmonella Typhimurium infection.

Animals and ethics approval
Collaborative Cross mice were purchased from the Systems Genetics Core Facility at the University of North Carolina (UNC) [27]. They had previously been generated and bred at Tel Aviv University in Israel [28], Geniad in Australia [29], and Oak Ridge National Laboratory in the US [30], and were further bred and maintained at the Institut Pasteur under specific-pathogen-free (SPF) conditions. 129S2/SvPasCrl (129) and C57BL/6J SPF mice were obtained from Charles River France and used as resistant (low bacterial loads in target organs) and susceptible (high bacterial loads) controls, respectively. Tlr4-deficient (knock-out) mice on a B6 background (B6.129-Tlr4tm1Aki) were kindly provided by Jean-Marc Cavaillon (Institut Pasteur).
All animal breeding and experiments conformed to European Directive 2010/63/EU and the French regulation of February 1st, 2013 on the protection of animals used for scientific purposes. Institut Pasteur's Animal Ethics Committee-CETEA (registered by French Research Ministry under n°89) approved experiments under numbers HA0038 and 2014-0050.

Salmonella Typhimurium infection
Mice were confined in a biosafety level 3 (BSL-3) animal facility a few days prior to infection. Salmonella Typhimurium strain SL1344, obtained from the National Collection of Type Cultures (NCTC 13347), was used for infection. One mL of frozen bacterial culture was grown in 50 mL of Trypticase soy broth (Bio-Rad) at 37°C to reach the exponential phase, with an optical density at 600 nm of 0.1-0.2. The exact bacterial density was determined by plating 10^-5 dilutions onto tryptic soy agar. The bacterial suspension was diluted to 5000 CFU/mL in PBS. Mice were infected by injection in the caudal vein with 1000 CFU of Salmonella Typhimurium in 200 μL. The infectious dose was verified following infection by serial dilutions of the inoculum plated on trypticase soy agar. Infected animals were monitored daily post-infection, and body-condition scoring (score < 2) was used as the clinical endpoint. A total of 148 mice from 35 CC strains, both males and females, in the age range of 7-20 weeks, were tested in 18 experiments (N = 2 to 24 mice per experiment). B6 mice were included in every experiment and 129 mice in most of them, as susceptible and resistant controls, respectively. Salmonella Typhimurium strain Keller, originally obtained from Dr. Hugh Robson (Royal Victoria Hospital, Montreal, Quebec), was also used to infect the CC042/GeniUnc strain to confirm its extreme phenotype with another Salmonella strain.
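The dose arithmetic in this protocol can be checked with a short calculation. The sketch below is illustrative only: the function names and the example plate count (50 colonies from 0.1 mL of a 10^-5 dilution) are assumptions and do not come from the study. Only the 5000 CFU/mL suspension and the 200 μL injection volume are taken from the text.

```python
def cfu_per_ml_from_plate(colonies, dilution, plated_volume_ml):
    """Estimate culture density (CFU/mL) from a colony count.

    dilution is the fraction of the original culture present in the
    plated sample, e.g. 1e-5 for a 10^-5 dilution.
    """
    return colonies / (dilution * plated_volume_ml)


def injected_dose(cfu_per_ml, injected_volume_ml):
    """CFU delivered when a given volume of suspension is injected."""
    return cfu_per_ml * injected_volume_ml


# Values stated in the text: 5000 CFU/mL suspension, 200 uL per mouse.
dose = injected_dose(5000, 0.2)

# Hypothetical plate count, for illustration only.
density = cfu_per_ml_from_plate(colonies=50, dilution=1e-5, plated_volume_ml=0.1)

print(f"dose per mouse: {dose:.0f} CFU")
print(f"estimated culture density: {density:.2e} CFU/mL")
```

With the stated suspension and volume, the calculation gives the 1000 CFU dose reported in the text.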

In vivo bacterial loads
Mice were euthanized by exposure to CO2 at day 4 post-infection. Spleen and liver were removed aseptically, weighed, placed in 2 mL of isotonic saline and homogenized using a tissue homogenizer (T25 Ultra-Turrax, IKA). The resulting homogenate was diluted in 1× PBS and serial dilutions were plated on tryptic soy agar to determine organ bacterial load.
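As a rough illustration of how a plate count could be converted into the log10 CFU per gram of organ reported in Additional file 1, a minimal sketch follows. The plated volume, dilution factor, colony count and organ weight are hypothetical, and the calculation assumes that the homogenate volume equals the 2 mL of saline, ignoring the volume of the tissue itself.

```python
import math


def log10_cfu_per_gram(colonies, dilution_factor, plated_volume_ml,
                       homogenate_volume_ml, organ_weight_g):
    """Convert a colony count into log10(CFU per gram of organ).

    dilution_factor is the fold dilution of the plated sample
    (e.g. 1000 for a 10^-3 dilution of the homogenate).
    """
    cfu_per_ml = colonies * dilution_factor / plated_volume_ml
    total_cfu = cfu_per_ml * homogenate_volume_ml
    return math.log10(total_cfu / organ_weight_g)


# Hypothetical example: 40 colonies from 0.1 mL of a 10^-3 dilution,
# organ homogenized in 2 mL (as in the text), organ weight 0.1 g.
load = log10_cfu_per_gram(40, 1000, 0.1, 2.0, 0.1)
print(f"{load:.2f} log10 CFU/g")
```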

In vivo LPS response
Mice were injected intraperitoneally with either 0.5 mL of PBS alone or 0.5 mL of PBS containing 100 μg of protein-free (0.008%) Escherichia coli K235 LPS. After 90 min, the mice were euthanized, and serum was collected by cardiac puncture. Tumor necrosis factor alpha (TNF-α) concentration in serum was measured using the Mouse TNF-alpha DuoSet kit (R&D Systems).

Genotyping and reconstruction of CC genome
CC strains have been previously genotyped at the Wellcome Trust Centre for Human Genetics (Oxford, UK) and at UNC (Chapel Hill, USA) with several high-density arrays, including the Mouse Universal Genotyping Arrays (MUGA and MEGA-MUGA), containing 7.5 K and 77.8 K SNPs respectively [31], and the Mouse Diversity Array (MDA), containing 620 K SNPs [32]. All the polymorphic SNPs homozygous in all founder strains were selected and introduced in HAPPY format using build 37 of the mouse reference genome. Each CC genome was reconstructed as a haplotype mosaic using a Hidden Markov Model (HMM) in the HAPPY software [33] to estimate the probabilities of descent from each founder strain at each locus. Even though the CC mice used were nearly inbred at the time of the experiment, several strains still had approximately 10% of heterozygous genome, as determined by the joint heterozygosity of obligate ancestors. Therefore, we ran the HMM under the diploid heterogeneous model to trace back each chromosome separately and to allow for residual heterozygosity, averaging the reconstructions over blocks of n = 20 consecutive SNPs to reduce computational complexity. We set the number of generations of inbreeding at 20, as previously described [34].
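To illustrate the idea of estimating probabilities of descent with an HMM, the sketch below runs a forward-backward pass over a single haploid chromosome. It is a deliberately simplified toy and not the HAPPY implementation: the study used a diploid model, whereas here the states are single founders, and the switch probability, genotyping error rate and the three-founder example are all assumptions made for illustration.

```python
def founder_posteriors(obs, founder_alleles, switch=0.01, err=0.01):
    """Posterior probability of descent from each founder at each SNP.

    obs: observed alleles (0/1), one per SNP.
    founder_alleles: for each SNP, the list of alleles carried by the K founders.
    switch: assumed probability of changing founder between adjacent SNPs.
    err: assumed genotyping error rate.
    """
    k = len(founder_alleles[0])
    n = len(obs)

    def emit(t, f):
        return 1 - err if founder_alleles[t][f] == obs[t] else err

    def trans(i, j):
        return 1 - switch if i == j else switch / (k - 1)

    def normalise(v):
        s = sum(v)
        return [x / s for x in v]

    # Forward pass, rescaled at each SNP to avoid underflow.
    fwd = [normalise([emit(0, f) / k for f in range(k)])]
    for t in range(1, n):
        prev = fwd[-1]
        fwd.append(normalise([
            emit(t, j) * sum(prev[i] * trans(i, j) for i in range(k))
            for j in range(k)
        ]))

    # Backward pass, also rescaled.
    bwd = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        bwd[t] = normalise([
            sum(trans(i, j) * emit(t + 1, j) * bwd[t + 1][j] for j in range(k))
            for i in range(k)
        ])

    # Combine both passes into per-SNP posteriors.
    return [normalise([fwd[t][f] * bwd[t][f] for f in range(k)]) for t in range(n)]


# Toy example with three hypothetical founders and four SNPs.
toy_founders = [[0, 1, 0], [0, 1, 1], [0, 1, 0], [0, 1, 1]]
toy_obs = [1, 1, 1, 1]
posteriors = founder_posteriors(toy_obs, toy_founders)
for row in posteriors:
    print([round(p, 3) for p in row])
```

In this toy example the second founder matches every observed allele, so it receives the highest posterior probability at each SNP.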

Statistical analysis
Bacterial load data analysis was performed using R statistical software. Analysis of Variance (ANOVA) was used for testing the influence of sex, age and experiment on the bacterial loads.
Mean bacterial loads were compared by one-way ANOVA and Tukey HSD test (Prism 6.0 software, GraphPad, La Jolla, CA, USA). Mean TNF-alpha responses to PBS and LPS injections were compared by two-way ANOVA and Holm-Šídák test (Prism 6.0 software, GraphPad, La Jolla, CA, USA).
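For readers unfamiliar with the test, the F statistic behind a one-way ANOVA can be computed from between-group and within-group sums of squares. The sketch below uses only the standard library and made-up numbers. It does not reproduce the Prism or R analyses, and it stops at the F statistic and degrees of freedom without computing a P value.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up log10 bacterial loads for three hypothetical strains.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```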

QTL mapping
QTL mapping was performed in R statistical software (release version 3.2.0) with the happy.hbrem package [33,35]. Individual phenotypic data were transformed to account for the experiment effect: i) first, data were fitted with the mean load of B6 control mice tested in each experiment using a linear regression model; ii) second, the residuals from the model were extracted and data were normalised; iii) third, mean values of normalised residuals for each strain were used for QTL mapping. The presence of a QTL was tested by ANOVA, comparing the fit of the genetic model with the null hypothesis. QTL mapping with CC strains consists of an eight-way haplotype linear regression with an additive model, as previously described [34]. The number of observations for each strain was used to weight the strain-averaged value in the regression analysis. Significance is reported as the -log10(P) value as computed by the R ANOVA function. Genome-wide significance (E < 0.5, E < 0.1 and E < 0.05) was estimated by permuting the CC strains (1000 tests). QTL confidence intervals were defined using a 1.5-log10 drop method.
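The three-step adjustment for the experiment effect can be sketched as follows. This is one possible reading of the description above, not the authors' R code: the text does not say how residuals were normalised, so a z-score is assumed here, and the function name and toy records are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def adjust_for_experiment(records, b6_mean_by_experiment):
    """records: list of (strain, experiment, log10_load) tuples.

    Returns {strain: (mean normalised residual, number of mice)}.
    """
    x = [b6_mean_by_experiment[exp] for _, exp, _ in records]
    y = [load for _, _, load in records]

    # Step i: simple linear regression of individual load on the B6 mean
    # of the same experiment.
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx

    # Step ii: residuals, then normalisation (z-score is an assumption).
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    sd = pstdev(residuals)
    normalised = [r / sd for r in residuals]

    # Step iii: per-strain means, keeping counts to use as regression weights.
    by_strain = defaultdict(list)
    for (strain, _, _), value in zip(records, normalised):
        by_strain[strain].append(value)
    return {s: (mean(v), len(v)) for s, v in by_strain.items()}


# Toy data, for illustration only.
toy_b6 = {"exp1": 7.0, "exp2": 7.5}
toy_records = [
    ("CC001", "exp1", 5.0), ("CC001", "exp2", 5.6),
    ("CC042", "exp1", 9.8), ("CC042", "exp2", 10.1),
    ("CC011", "exp2", 4.2),
]
strain_values = adjust_for_experiment(toy_records, toy_b6)
for strain, (value, n) in sorted(strain_values.items()):
    print(strain, round(value, 3), n)
```

The per-strain counts returned alongside the means correspond to the weights mentioned in the text for the regression analysis.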

Founder effect estimation, merge analysis and candidate genes selection
Founder contributions for each trait analyzed were determined by a hierarchical Bayesian random-effects model using the happy.hbrem package. To distinguish the contribution of each founder, the WSB/EiJ estimate was set to 0 and its effect represented by the mean value between all founders, while other founder effects were presented as the difference from the effect of WSB. To identify potentially causal SNPs in each QTL interval, we used merge analysis [36]. Most SNPs have only two alleles, thus we merged the eight CC founders into (typically) two groups according to their allelic variation based on sequence data in the founder strains. Instead of testing for phenotypic differences between all eight founders to test for a QTL in a given interval, differences are tested between the groups of merged founders. The reduction in the dimension of the test results in increased merge logP-values compared with interval logP-values in variants responsible for the QTL. Merge analysis provides an efficient tool to prioritize SNPs within QTL intervals.
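The merging step can be pictured as summing, for each strain, the founder descent probabilities of all founders that share the same allele at a SNP. The sketch below shows only this collapsing step, under assumed inputs. The short founder labels, the example probabilities and the example allele pattern are hypothetical, and the subsequent association test is not shown.

```python
# Short labels for the eight CC founders (abbreviations chosen for this sketch).
FOUNDERS = ["AJ", "B6", "129S1", "NOD", "NZO", "CAST", "PWK", "WSB"]


def merge_by_allele(founder_probs, allele_by_founder):
    """Collapse founder probabilities into allele-group probabilities.

    founder_probs: {founder: probability of descent at the locus}.
    allele_by_founder: {founder: allele carried at the SNP}.
    """
    merged = {}
    for founder, prob in founder_probs.items():
        allele = allele_by_founder[founder]
        merged[allele] = merged.get(allele, 0.0) + prob
    return merged


# Hypothetical biallelic SNP where only PWK carries the alternative allele.
toy_alleles = {f: "C" for f in FOUNDERS}
toy_alleles["PWK"] = "T"

# Hypothetical strain that is mostly PWK-derived at this locus.
toy_probs = {f: 0.0 for f in FOUNDERS}
toy_probs["PWK"] = 0.9
toy_probs["B6"] = 0.1

merged_probs = merge_by_allele(toy_probs, toy_alleles)
print(merged_probs)
```

Eight founder columns become two allele columns for a biallelic SNP, which is the reduction in test dimension described above.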

Diversity in response to S. Typhimurium infection
To explore the influence of genetic diversity of CC mice on their susceptibility to Salmonella Typhimurium infection, we infected groups of mice from 35 CC strains and measured spleen and liver bacterial loads at day 4 post-infection. To assess reproducibility and normalize data across experiments, B6 mice were included in all experiments and 129 mice in most of them, as reference susceptible and resistant strains, respectively.
Since the study was carried out in 18 successive experiments with mice from both sexes and at different ages, we first evaluated sex, age and experiment effects. No consistent significant differences were found between males and females (P = 0.03 and 0.89 for spleen and liver respectively, threshold P = 0.025 after Bonferroni correction), nor between mice of different ages (P = 0.46 and 0.14 for spleen and liver). However, variations between experiments were statistically significant (P < 2.2 × 10^-16 for both spleen and liver). To take into account the variability across experiments, we used the mean value of B6 mice tested in each experiment to adjust CC individual data before QTL mapping.
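The Bonferroni threshold quoted above can be reproduced with a one-line calculation. The sketch assumes a nominal alpha of 0.05 divided across two tests (spleen and liver), which is inferred from the reported threshold of 0.025 rather than stated explicitly in the text.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Assumed: alpha = 0.05 and two tests (one per organ).
threshold = bonferroni_threshold(0.05, 2)

# P values for the sex effect, as reported in the text.
sex_p_values = {"spleen": 0.03, "liver": 0.89}
sex_significant = {organ: p < threshold for organ, p in sex_p_values.items()}

print(threshold)
print(sex_significant)
```

Under these assumptions, neither P value for the sex effect falls below the corrected threshold.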

QTL mapping reveals five susceptibility loci
To identify host genes controlling the variation in susceptibility to Salmonella Typhimurium in CC strains, we tested the association between the median organ bacterial load of the 35 CC strains and the founder haplotype probabilities based on the genotyping data of the CC strains. Five QTLs associated with spleen or liver bacterial load were mapped at genome-wide E < 0.5. They will be referred to as Salmonella Typhimurium susceptibility loci-1 (Stsl1) to Stsl5, in order of decreasing statistical significance. QTL confidence intervals were established using a 1.5-log10 drop. Figure 2 and Table 1 summarize the significance level, peak position and interval width of each QTL.
The strain CC042 shows an exceptionally susceptible phenotype in our study. Since we suspected that extreme trait values could have a strong weight on QTL identification, we re-ran the QTL mapping analysis without CC042. However, since the results were essentially similar (data not shown), we included all 35 strains in the analysis.
We wondered whether the same QTLs could have been identified with a smaller number of CC strains. To this end, we ran QTL mapping on random subsets of 15, 20, 25, 30 or 34 strains (500 permutations for each subset size, and 35 permutations for the subsets of 34 strains) and we computed the frequency at which Stsl1 and Stsl2 could be identified, and at which genome-wide significance (see Additional file 2: Figure S1). Subsets of fewer than 20 strains were almost never able to detect either of the two QTLs. Subsets of 25 and 30 strains detected Stsl1 in 45% and 90% of cases, while Stsl2 was found in 26% and 63% of cases, respectively. Subsets of 34 strains detected Stsl1 in 100% of cases and Stsl2 in 97% of cases. This suggests that these two QTLs are robust.
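A log10-drop interval can be obtained by walking outward from the peak until the association score falls more than 1.5 units below the maximum. The sketch below is a minimal version under stated assumptions: the marker positions and scores are invented, and the interval is taken as the contiguous run of markers around the peak, which may differ from the convention used in the study.

```python
def log_drop_interval(positions, logp, drop=1.5):
    """Return (left, peak, right) positions of a log10-drop interval.

    positions: marker positions, sorted along the chromosome.
    logp: -log10(P) value at each marker.
    """
    peak = max(range(len(logp)), key=logp.__getitem__)
    cutoff = logp[peak] - drop

    # Extend left while neighbouring markers stay above the cutoff.
    left = peak
    while left > 0 and logp[left - 1] >= cutoff:
        left -= 1

    # Extend right in the same way.
    right = peak
    while right < len(logp) - 1 and logp[right + 1] >= cutoff:
        right += 1

    return positions[left], positions[peak], positions[right]


# Invented scan over seven markers (positions in Mb).
toy_positions = [10, 11, 12, 13, 14, 15, 16]
toy_logp = [1.0, 3.2, 4.0, 4.6, 3.5, 2.9, 3.3]
interval = log_drop_interval(toy_positions, toy_logp)
print(interval)
```

In the toy scan the last marker rises back above the cutoff but is excluded, because the interval stops at the first marker that drops below it.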
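The subsampling procedure can be outlined as repeatedly drawing random subsets of strains and recording how often a QTL is detected. In the sketch below, `detects` is a hypothetical callable standing in for the whole QTL-mapping pipeline, and the toy rule and strain labels used in the example are placeholders with no biological meaning. The last line shows that exactly 35 distinct subsets of 34 strains can be drawn from 35, which may explain the smaller number of permutations quoted for that subset size.

```python
import random
from math import comb


def detection_frequency(strains, subset_size, n_draws, detects, seed=0):
    """Fraction of random subsets in which `detects(subset)` is true.

    detects: hypothetical callable that would run QTL mapping on the
    subset and report whether the QTL reached significance.
    """
    rng = random.Random(seed)
    hits = sum(bool(detects(rng.sample(strains, subset_size))) for _ in range(n_draws))
    return hits / n_draws


# Placeholder strain labels and a toy detection rule, for illustration only.
toy_strains = [f"strain{i:02d}" for i in range(1, 36)]


def toy_detects(subset):
    return "strain01" in subset


freq_full = detection_frequency(toy_strains, 35, 20, toy_detects)
freq_small = detection_frequency(toy_strains, 20, 200, toy_detects)
n_subsets_of_34 = comb(35, 34)

print(freq_full, freq_small, n_subsets_of_34)
```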

Estimation of founder effect shows complexity
The presence of a QTL implies a contrast in mean trait values between mice carrying different haplotypes at the locus. To understand the underlying cause of the identified QTLs, we estimated for each of them the effects of the eight founder haplotypes across the QTL interval and at the peak location.
Stsl1 is localized on Chr 8 with a peak location at 12.5 Mb (Fig. 3a). The founder contributions, calculated across the critical interval (Fig. 3b) and at the peak location (Fig. 3c), indicate a contrast between PWK/PhJ (higher spleen bacterial loads) versus 129S1/SvImJ and CAST/EiJ strains (lower values). Stsl2 is localized on Chr 10 with a peak at 52.3 Mb (Fig. 4a). The founder contribution indicates a contrast between PWK/PhJ and B6 (higher) versus NZO/HILtJ (lower, Fig. 4b and c). Stsl3 is composed of two distinct peaks on Chr 1, respectively Stsl3b at 79.2 Mb and Stsl3a at 83.9 Mb (see Additional file 3: Figure S2). The founder contribution is different between the two peaks. For Stsl3b, it indicates a contrast between 129S1/SvImJ and B6 (higher) versus PWK/PhJ (lower). For Stsl3a, it indicates a contrast between B6 (higher) versus the others. Stsl4 peak location is at 81.2 Mb on Chr 6 (see Additional file 4: Figure S3). The founder contribution is a contrast between NZO/HILtJ and PWK/PhJ (higher) versus B6 (lower). Stsl5 peak location is at 84.8 Mb on Chr 17 (see Additional file 5: Figure S4). The founder contribution mainly shows a contrast between B6 (higher) versus NOD/ShiLtJ (lower).
Fig. 2 QTLs associated with bacterial loads in spleen and liver after S. Typhimurium infection in CC strains. Genome-wide associations for bacterial loads in spleen (a) and liver (b) in 35 CC strains. X-axis: genome location; Y-axis: -log10(P) values of the test of association between genotype and phenotype. Genome-wide thresholds of association at E < 0.5, E < 0.1 and E < 0.05 significance levels are indicated respectively by horizontal gray, orange and red lines. (a) Spleen bacterial load thresholds respectively at -log10(P) = 2.9, 3.8 and 4.2. (b) Liver bacterial load thresholds respectively at -log10(P) = 2.9, 4.0 and 4.3.
In conclusion, the different QTLs are caused by markedly distinct patterns of contrasts between founders, with multi-allelic variations involved.

Association analysis of sequence variants and candidate genes
The confidence intervals of the QTLs we mapped encompass too many genes to make assumptions on the most likely candidates (Table 1). In particular, a total of 144 genes and 62 genes were identified from public databases within the most significant QTLs, Stsl1 and Stsl2, respectively. In order to prioritize genes within these two intervals, we performed merge analysis [34,36,42] on SNP variants within these QTLs and combined the results with gene expression data.
Merge analysis reduces the dimensionality of statistical tests by merging data from strains which share the same allele at a given SNP. If a QTL is caused by a single variant with a particular strain distribution pattern (SDP) among the founders, nearby SNPs with the same SDP will have higher logP-values than in an 8-way haplotype linear model, as a result of the reduced dimensionality of the test.
As expected, we found a fraction of SNPs with higher merged logP-values (dots and triangles in Fig. 5) than the interval mapping logP-value (continuous line) near Stsl1 and Stsl2, enabling us to filter out the majority of SNPs (Fig. 5). For Stsl1, both multi-allelic (triangles) and bi-allelic (dots) variants were found among the significant SNPs, which likely reflects the complexity of the predicted founder contributions. For Stsl2, only bi-allelic SNPs (dots) were found among the significant SNPs. Merge analysis therefore allowed us to reduce the number of candidate genes to 60 genes for Stsl1 (see Additional file 6: Table S2) and 11 genes for Stsl2 (see Additional file 7: Table S3), with only 32 genes and 6 genes possessing significant merge SNPs nearby (highlighted in red circles in Fig. 5) for Stsl1 and Stsl2, respectively. To further refine our candidate gene list, we prioritized genes expressed by immune cells. We used MGI, ENSEMBL, ImmGen and IMPC to evaluate gene expression levels in immune cells, ontology terms, and known functions or phenotypes. As a result of this combined analysis, the most promising candidates in the Stsl1 interval are Cul4a, Lamp1, Mcf2l and Pcid2, which are expressed in immune cells and have significant SNPs nearby. In the Stsl2 interval, 4 out of 11 genes are expressed in immune cells, while the most promising gene, Slc35f1, which has significant SNPs nearby, is not reported to be expressed in splenic immune cells.

Tlr4 is functional in the CC042 extreme susceptible strain
The CC042 strain showed extreme susceptibility to S. Typhimurium, with a 1000-fold higher organ bacterial load than B6. CC042 has inherited a B6 susceptible allele at the Slc11a1 locus, which contributes to its phenotype. However, additional alleles are required to explain its enhanced susceptibility. We observed that CC042 has also inherited a wild-derived PWK/PhJ haplotype at the Tlr4 locus, which contains several missense mutations (Sanger mouse SNP viewer [43]). Tlr4-deficient mice show very high susceptibility to Salmonella infection due to a defective response to LPS [44,45]. Although we did not identify any QTL in the Chr 4 region which contains Tlr4, we wondered whether the CC042 allele of Tlr4 was functional. We first infected CC042, (Tlr4+/+ B6 × CC042)F1 and (Tlr4−/− B6 × CC042)F1 mice with S. Typhimurium and compared organ bacterial loads 4 days later. Wild-type Tlr4+/+ B6 mice and mutant knock-out Tlr4−/− B6 mice were included as positive and negative controls, respectively. Figure 6a shows that Tlr4−/− B6 mice had mean bacterial loads as high as 10^10 CFUs in the spleen, > 1000-fold higher than in Tlr4+/+ B6 mice. Very high bacterial loads were also observed in the spleen of CC042 mice. By contrast, (Tlr4+/+ B6 × CC042)F1 and (Tlr4−/− B6 × CC042)F1 mice had lower bacterial loads (10^8 CFUs) in the spleen, which were not statistically different from those measured in Tlr4+/+ B6 mice. In the liver, Tlr4−/− B6 and CC042 mice had bacterial loads higher than 10^8 CFUs, that is > 1000-fold higher than the bacterial loads measured in Tlr4+/+ B6 mice. The bacterial loads in the liver of (Tlr4+/+ B6 × CC042)F1 and (Tlr4−/− B6 × CC042)F1 mice were both 10^6 CFUs, 10-fold higher than in Tlr4+/+ B6 mice (Fig. 6b). Since Tlr4 KO is fully recessive, we conclude that the Tlr4 allele is functional in CC042.
LPS is a major component of the outer membranes of Gram-negative bacteria, including Salmonella Typhimurium. TLR4-mediated LPS response results in the release of various pro-inflammatory cytokines, including TNF-α [46]. We also investigated the in vivo response of the CC042 strain to LPS. Wild-type Tlr4+/+ B6 and mutant Tlr4−/− B6 mice were used as positive and negative controls, respectively. Mice were injected with PBS alone or PBS containing 100 μg LPS and their TNF-α serum concentration was measured at 90 min post-injection. Figure 6c shows that all mice treated with PBS alone failed to release TNF-α and its levels were < 58 pg/ml. Wild-type Tlr4+/+ B6 mice injected with LPS produced significant levels of TNF-α (1561 pg/ml) compared to PBS-treated Tlr4+/+ B6 animals (ANOVA test, p < 0.0001). By contrast, Tlr4−/− B6 mice injected with LPS failed to produce TNF-α (110 pg/ml) and showed no difference with PBS-treated animals. Interestingly, CC042 mice had a significant increase in TNF-α levels compared to PBS-injected mice (803 pg/ml, p = 0.009).
These results confirm that the Tlr4 allele is functional in the CC042 strain.
Fig. 5 Merge analysis of sequence variants around Stsl1 and Stsl2 interval. The X-axis is genome location; the Y-axis is the -log10(P) of the test of association between locus and bacterial load. Only the critical interval for each locus is shown. The continuous black line is the genome scan result in Fig. 3. The dots correspond to the results of the merge analysis. Biallelic SNPs are in gray; triallelic SNPs are in blue; variants with more than three alleles are in red. The larger dots circled in red correspond to SNPs with the most significant merge -log10(P). All SNPs with merge -log10(P) < 1 are not shown. a) Stsl1 QTL on Chr 8. b) Stsl2 QTL on Chr 10.
Figure caption fragment (figure number not recovered): The mouse genome location is on the X-axis and significance (−log10(P)) values on the Y-axis, with genome-wide thresholds of association at E < 0.5, E < 0.1 and E < 0.05 levels indicated respectively by the gray, orange and red lines. Peak location (maximum value of -log10(P)) is marked by a star.

Discussion
In this study, we have used Collaborative Cross mice to explore a genetic diversity larger than in previous studies, which could result in identifying novel phenotypes and host genes controlling susceptibility to Salmonella infection.
Compared to classical laboratory inbred strains, the 35 CC strains tested exhibited a wider range of bacterial loads in the spleen and liver at day 4 post-infection. Interestingly, some CC strains showed phenotypes beyond the previously reported range, with four strains (CC011/Unc, CC024/GeniUnc, CC002/Unc and CC051/TauUnc) having 3- to 4-fold lower CFUs in spleen or liver than the resistant 129 strain, and the CC042 strain having 1000-fold higher CFUs than the susceptible B6 strain. Strain CC046/Unc exhibited a new phenotype combination with opposite CFU levels in the two target organs studied (high CFUs similar to B6 in the liver and low CFUs similar to 129 in the spleen). Our findings demonstrate that the host genetic diversity provided by the CC population makes it possible to uncover new, diverse phenotypes previously unseen in classical laboratory strains. These extreme and rare strains represent new models to study the pathophysiology of Salmonella infections and to explore how host genetic differences affect susceptibility.
The contrast between strains could be influenced by non-genetic factors such as the microbiota. However, most strains were bred in the same room under SPF conditions and the IV route of infection we used bypasses the intestinal phase and results in rapid septicemia, minimizing a potential influence of microbiota.
We identified two significant and one suggestive QTLs for spleen bacterial load as well as two suggestive QTLs for liver bacterial load. Despite a high and positive Pearson correlation between the two phenotypes (R² = 0.88), no QTLs common to both organs were identified. In fact, none of the QTLs identified for spleen load was even close to significance for the liver, and vice versa. Several factors may explain this finding. First, the correlation between the two traits may not be strong enough for a QTL primarily associated with one trait to be detected secondarily with the other trait. Second, the number of strains and the effect size of each QTL may be limiting. Finally, it is possible that bacterial proliferation in spleen and liver is under the control of different genes and different mechanisms.
By using the CC reference population, which includes three wild-derived founders, we expected to identify novel host genetic variants and mechanisms of susceptibility to infectious diseases. Previous studies identified various QTLs implicated in the differences in host immunity to infection with Salmonella Typhimurium: Immunity to Typhimurium (Ity) [10,19,47,48], Modifier of Salmonella Typhimurium Susceptibility (Msts) [49] and Susceptibility to Salmonella Typhimurium Antigens (Ssta) [50,51]. None of them localized to the regions on Chr 8 and 10 that were identified in our study. Previous studies identified three QTLs for Salmonella susceptibility using crosses with the susceptible wild-derived Mus m. molossinus MOLF/Ei strain [9,47]. Likewise, we found that the two significant QTLs we mapped involved a contrast with one of the three wild-derived strains, highlighting the importance of the wild-derived founders' contribution in the CC.
It is well known that the Slc11a1 and Tlr4 genes have major influences on the susceptibility to Salmonella of laboratory strains. The B6 inbred strain is susceptible due to a single missense mutation in Slc11a1 [12]. The broad critical interval for the Stsl3a and Stsl3b suggestive QTLs on Chr 1 (70-100 Mb) contains Slc11a1 (74.3 Mb). Interestingly, the CC042 strain, which exhibits an extreme susceptibility phenotype, has inherited a B6 susceptible haplotype at the Slc11a1 locus. However, this allele alone is not sufficient to explain the extreme phenotype of this strain, as three other CC strains that inherited the same Slc11a1 susceptible allele from the B6 founder (CC0021/Unc, CC045/GeniUnc and C0061/GeniUnc) have 10- to 12-fold lower splenic bacterial loads than B6 (see Additional file 8: Figure S5). Another important gene involved in susceptibility to Gram-negative bacteria is Tlr4. Although no QTL was detected on Chr 4 where Tlr4 is localized, we assessed the functionality of Tlr4 in CC042. This strain inherited a wild-derived PWK/PhJ Tlr4 haplotype which contains several missense mutations. We confirmed by LPS stimulation that this PWK/PhJ-derived Tlr4 allele is functional in the CC042 strain. Moreover, three other CC strains inherited the same PWK/PhJ Tlr4 haplotype (CC006/TauUnc, CC052/GeniUnc, CC061/GeniUnc, see Additional file 8: Figure S5) and have resistant to intermediate splenic bacterial loads (respectively 5.3, 4.34 and 4.94), which shows that this allele is not associated with high susceptibility. These results emphasize that host genetic resistance to Salmonella Typhimurium is complex, with many genes interacting. Major genes identified in classical laboratory strains may not have the same impact in a population harboring more genetic diversity.
In this study we used Salmonella Typhimurium strain SL1344. To confirm that our results are not specific to this bacterial strain, CC042 mice were infected with S. Typhimurium strain Keller. CC042 mice presented the same degree of extreme susceptibility (1000-fold higher CFUs in spleen and liver, data not shown) as compared to B6 mice. This is consistent with previous evidence in the literature that the same host susceptibility loci can be identified with different S. Typhimurium strains. Msts 1-4 loci were identified using the S. Typhimurium C5 strain [49] and correspond to loci previously identified on Chr 1 (Slc11a1), Chr 6, Chr 11 (Ity2) and Chr 13 (Ity13) using the S. Typhimurium Keller strain [9,19,47,48].
The confidence intervals of the two major QTLs, Stsl1 and Stsl2 contain too many genes to directly point at likely candidate causal genes. In order to prioritize them, we used sequence variation information and merge analysis strategy [36] combined with gene expression, known function and phenotypes, to refine Stsl1 and Stsl2 QTLs intervals and identify candidate genes. Within Stsl1 QTL, four genes are strong candidates based on known phenotypes: Cul4a (cullin 4A) deficient animal die in utero and Cul4a is essential for hematopoietic cell survival [52]; Lamp1 (lysosomal-associated membrane protein 1) is highly expressed in macrophages and is involved in autophagy as well as protecting NK cell from degranulation-associated damage [53,54]; Mcf2l (mcf.2 transforming sequence-like) targeted mutant mice (IMPC) have a decreased number of CD8-positive T cells; Pcid2 (PCI domain containing 2) is essential for spleen development and regulation of B cell differentiation [55]. Within Stsl2 QTL one gene is a high-potential candidate: Slc35f1 targeted mutant have a decreased lactate dehydrogenase activity (IMPC, Phenotype MP:0005571), which may alter the pyruvate metabolism pathway in Salmonella [56].
Computer simulations indicated that at least 500 strains would be required to map a single additive QTL explaining 5% of the phenotypic variation [20]. However, such large numbers are not available, since 95% of CC strains became extinct during the inbreeding process, so that only 70 strains are distributed [57,58]. In our study, we have shown that even a smaller number of CC strains (here, 35) can provide enough power to identify QTLs with genome-wide significance. We used our experimental data to estimate the minimum number of strains needed to identify the two major QTLs found with 35 strains. We found that 20 strains almost always missed them, while 30 strains were almost as successful as the full set of 35 at identifying Stsl1. This conclusion depends on the size of the QTL effect.
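The idea of estimating detection power by subsampling strains can be sketched as follows. This is a simplified illustration and not the analysis performed in this study: it assumes a single biallelic locus rather than eight founder haplotypes, uses simulated strain means, and replaces the genome-wide significance threshold with an arbitrary cutoff on Welch's t statistic. The names `simulate_panel`, `detection_rate` and `t_threshold` are hypothetical.

```python
import random
import statistics


def welch_t(group_a, group_b):
    """Welch's t statistic comparing two groups of strain means."""
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    se = (var_a / len(group_a) + var_b / len(group_b)) ** 0.5
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se


def detection_rate(strain_means, genotypes, n_sub, n_rep, t_threshold, rng):
    """Fraction of random strain subsets in which the locus passes the cutoff."""
    strains = list(strain_means)
    hits = 0
    for _ in range(n_rep):
        subset = rng.sample(strains, n_sub)
        carriers = [strain_means[s] for s in subset if genotypes[s] == 1]
        others = [strain_means[s] for s in subset if genotypes[s] == 0]
        # A subset needs at least two strains per allele group to be testable;
        # untestable subsets count as misses.
        if len(carriers) < 2 or len(others) < 2:
            continue
        if abs(welch_t(carriers, others)) > t_threshold:
            hits += 1
    return hits / n_rep


def simulate_panel(n_strains, n_carriers, effect, noise_sd, rng):
    """Simulated log10 bacterial loads; all parameter values are assumptions."""
    means, genotypes = {}, {}
    for i in range(n_strains):
        name = f"strain_{i:02d}"
        genotypes[name] = 1 if i < n_carriers else 0
        means[name] = 4.0 + effect * genotypes[name] + rng.gauss(0.0, noise_sd)
    return means, genotypes


if __name__ == "__main__":
    rng = random.Random(42)
    means, genotypes = simulate_panel(35, 8, 1.0, 0.5, rng)
    for n_sub in (20, 25, 30, 35):
        rate = detection_rate(means, genotypes, n_sub, 500, 4.0, rng)
        print(n_sub, round(rate, 2))
```

In such a sketch the detection rate generally rises with the number of strains sampled and with the assumed effect size, which is the dependence noted above.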

Conclusion
By exploring a broader genetic variation, the Collaborative Cross population has revealed novel loci of resistance to Salmonella Typhimurium. It also led to the identification of CC042 as an extremely susceptible strain. This study provides a further example of the power of the CC resource to reveal novel phenotypes and identify additional host genes controlling quantitative traits such as susceptibility to infections. These results will further enhance our capacity to understand the complex host-bacteria interplay.

Additional files
Additional file 1: Table S1. Individual organ bacterial loads at day 4 post-infection with Salmonella Typhimurium. Individual values for each animal tested are given: animal number (N), strain, alias (collected from UNC Systems Genetics), sex (Females | Males), spleen bacterial load as log10 of CFUs per gram of spleen (log10.CFUs.g.Spleen), liver bacterial load as log10 of CFUs per gram of liver (log10.CFUs.g.Liver) and experiment. NA: missing value. (XLSX 13 kb)
Additional file 2: Figure S1. QTLs associated with bacterial loads in spleen after S. Typhimurium infection in different subsets of CC strains. X-axis: genome location of the QTLs Stsl1 and Stsl2 identified in Fig. 2; Y-axis: probability of detecting QTLs at different genomic significance levels (E < 0.5 in gray, E < 0.1 in orange, E < 0.05 in red and combined in blue). Genome-wide thresholds of association at E < 0.5, E < 0.1
Additional file 3: Figure S2. Founder contributions and haplotype around Stsl3 QTL on Chr 1. (A) Genome scan magnification for Stsl3 QTL region (70-100 Mb on Chr 1). The mouse genome location is on the X-axis and significance (−log10(P)) values on the Y-axis, with genome-wide thresholds of association at E < 0.5, E < 0.1 and E < 0.05 levels indicated respectively by the gray, orange and red lines. Peak locations Stsl3a and Stsl3b (maximum value of −log10(P)) are marked by stars. (B) Founder contributions in the same magnified region. The peak location of Stsl3a is marked by a star. Each of the 8 founders is in a different color. The mouse genome location is on the X-axis and the Y-axis shows the founder estimated effect on splenic bacterial load after S. Typhimurium infection. (C) Founder contributions at Stsl3a QTL peak (83.9 Mb). The X-axis shows the different founder strains. The Y-axis shows the estimated founder effect. No obvious contributions explain Stsl3a QTL, but B6 (grey) has the highest estimated impact of the 8 founders. (D) Founder contributions at Stsl3b QTL peak (79.2 Mb). There is no obvious founder contribution for Stsl3b QTL peak region. 129 (pink) has the highest estimated impact of the 8 founders while PWK (red) has the lowest estimate. (PDF 215 kb)
Additional file 4: Figure S3. Founder contributions and haplotype around Stsl4 QTL on Chr 6. (A) Genome scan magnification for Stsl4 QTL region (60-100 Mb on Chr 6). The mouse genome location is on the X-axis and significance (−log10(P)) values on the Y-axis, with genome-wide thresholds of association at E < 0.5, E < 0.1 and E < 0.05 levels indicated respectively by the gray, orange and red lines. Peak location (maximum value of −log10(P)) is marked by a star. (B) Founder contributions in the same magnified region. The peak location is marked by a star. Each of the 8 founders is in a different color. The mouse genome location is on the X-axis and the Y-axis shows the founder estimated effect on splenic bacterial load after S. Typhimurium infection. (C) Founder contributions at Stsl4 QTL peak (81.2 Mb). The X-axis shows the different founder strains. The Y-axis shows the estimated founder effect. No obvious contributions explain Stsl4 QTL, but B6 has the lowest estimated impact while NZO/HILtJ and PWK/PhJ have the highest estimates. (PDF 160 kb)
Additional file 5: Figure S4. Founder contributions and haplotype around Stsl5 QTL on Chr 17. (A) Genome scan magnification for Stsl5 QTL region (75-95 Mb on Chr 17). The mouse genome location is on the X-axis and significance (−log10(P)) values on the Y-axis, with genome-wide thresholds of association at E < 0.5, E < 0.1 and E < 0.05 levels indicated respectively by the gray, orange and red lines. Peak location (maximum value of −log10(P)) is marked by a star. (B) Founder contributions in the same magnified region. The peak location is marked by a star. Each of the 8 founders is in a different color. The mouse genome location is on the X-axis and the Y-axis shows the founder estimated effect on splenic bacterial load after S. Typhimurium infection. (C) Founder contributions at Stsl5 QTL peak (84.8 Mb). The X-axis shows the different founder strains. The Y-axis shows the estimated founder effect. No obvious contributions explain Stsl5 QTL, but B6 has the highest estimated impact while NOD/ShiLtJ has the lowest. (PDF 123 kb)
Additional file 6: Table S2. Genes remaining in Stsl1 interval post merge analysis. Gene symbol, start and end positions, name, high merged SNPs, expression in immune cell, cell-type major expression and Gene Ontology (GO) terms are given. Gene positions (build mm9), names as well as GO terms were collected from UCSC, MGI and ENSEMBL, while expression data were collected from Male/Female RNAseq of ImmGen. (XLSX 496 kb)
Additional file 7: Table S3. Genes remaining in Stsl2 interval post merge analysis. Gene symbol, start and end positions, name, high merged SNPs, expression in immune cell, cell-type major expression and Gene Ontology (GO) terms are given. Gene positions (build mm9), names as well as GO terms were collected from UCSC, MGI and ENSEMBL, while expression data were collected from Male/Female RNAseq of ImmGen. (XLSX 492 kb)
Additional file 8: Figure S5. CC strains carrying either Tlr4<PWK> or Slc11a1<B6>. Same data as in Fig. 1. Strains carrying the Slc11a1<B6> susceptible allele are highlighted in red boxes. Strains carrying the Tlr4<PWK> allele are highlighted in blue circles. Neither of these alleles is associated with higher or lower bacterial loads in spleen or liver. (PDF 231 kb)