Analysis of factors contributing to variation in the C57BL/6J fecal microbiota across German animal facilities

signiﬁcant associations to speciﬁc aspects of husbandry. Salient ﬁndings include a reduction in alpha diversity with the use of irradiated chow, an increase in inter-individual variability (beta diversity) with respect to barrier access and open cages and an increase in bacterial community divergence with time since importing from a vendor. We further observe a high degree of facility-level individuality, which is likely due to each facility harboring its own unique combination of multiple varying attributes of animal husbandry. While it is important to account and control for such differences between facilities, the documentation of such diversity may also serve as a valuable future resource for investigating the origins of microbial-driven host phenotypes.


a b s t r a c t
The intestinal microbiota is involved in many physiological processes and it is increasingly recognized that differences in community composition can influence the outcome of a variety of murine models used in biomedical research. In an effort to describe and account for the variation in intestinal microbiota composition across the animal facilities of participating members of the DFG Priority Program 1656 "Intestinal Microbiota", we performed a survey of C57BL/6J mice from 21 different mouse rooms/facilities located at 13 different institutions across Germany. Fresh feces was sampled from five mice per room/facility using standardized procedures, followed by extraction and 16S rRNA gene profiling (V1-V2 region, Illumina MiSeq) at both the DNA and RNA (reverse transcribed to cDNA) level. In order to determine the variables contributing to bacterial community differences, we collected detailed questionnaires of animal husbandry practices and incorporated this information into our analyses. We identified considerable variation in a number of descriptive aspects including the proportions of major phyla, alpha-and beta diversity, all of which displayed

Introduction
The lab mouse is a mainstay of biomedical research and accordingly one of the main experimental systems used to study microbial communities in a mammalian host. Although important differences in bacterial composition, anatomy, and physiological responses exist between mice and humans, mice still represent a well-characterized model to conceptualize universal mammalian responses (Lin et al., 2014;Seok et al., 2013;Takao and Miyakawa, 2015). Accordingly, considerable effort to standardize mouse research has been made since the beginning of the 20th century. The establishment and conservation of numerous in-and outbred mouse lines from diverse origins established a basis for genetic comparability and potential for independent experimental validation, although there are still possibilities for improvement in standardization and implementation (Ioannidis et al., 2014;Perrin, 2014).
In recent years host-associated bacterial communities gained the attention of numerous areas in biomedicine such as research on metabolic disorders (Le Chatelier et al., 2013), allergies (Hill et al., 2012;Stefka et al., 2014), diabetes (Qin et al., 2012), rheumatoid arthritis (Vaahtovuo et al., 2008;Zhang et al., 2015) and inflammatory bowel diseases (Manichanh et al., 2006). Experimental investigation of disease mechanisms usually takes place in mouse models. Examples include determining gene × environment effects on microbial communities in vivo using genetically altered mouse strains under different treatment regimens (Ley et al., 2005;Rausch et al., 2015), the manipulation of microbiota via cohousing (Seedorf et al., 2014) and dietary interventions . However, microbial communities are complex and often an uncontrolled variable in experiments. Beginning in the mid 1960's, efforts were taken to reduce the effects of different bacterial communities across experimental populations with the introduction of standardized minimal floras such as the "altered Schaedler flora" (Dewhirst et al., 1999;Gordon and Dubos, 1970;Schaedler et al., 1965), the complete removal of microbes altogether (gnotobiotic breeding), or a combination of both through the targeted inoculation of germ free mice (Gordon and Pesti, 1971). Germ free re-derivation and subsequent inoculation with a defined flora became a standard not only for commercial vendors, but for many mouse husbandries alike (Baker, 1966) and shows long term stability over several generations under specific pathogen-free housing conditions (Stehr et al., 2009). Other projects paralleling these efforts were successful in designing minimal functional microbial consortia, which are relatively easy to maintain, manipulate, and to analyze (Becker et al., 2011;Reyes et al., 2013).
Despite efforts to control the microbial composition of experimental animals, ample opportunity exists for microbial communities to diverge over time, both within and between animal facilities. This may be on the one hand due to inherent practices of husbandry, from differences in the treatment and provider of chow to policies of barrier access, hygiene, and types of cages used. On the other hand, underlying ecological processes such as the lack of dispersal/transmission and random fluctuations in abundance can lead communities to diverge over time (Hubbell, 2001;Simberloff and Wilson, 1969;Volkov et al., 2003). These forces are recognized to play a role in microbiome studies and appear to occur already at the scale of single cages and rooms (McCafferty et al., 2013;Rogers et al., 2014) and are reinforced by transgenerational bacterial transmission and coprophagy (Dominguez-Bello et al., 2010;Shanahan et al., 2014a,b;Ubeda et al., 2012). On a larger scale, such stochastic processes combined with environmental differences (e.g. chow supply, mouse strains, hygiene plan, etc.) lead to differences among facilities (Martiny et al., 2011), which may contribute to the lack of consistency in microbiome studies involving many mouse models (Laukens et al., 2016). However, still little is known regarding the relative impact of the many potential factors influencing variation in the lab mouse fecal microbiota over a broad scale.
In this study, we take advantage of the large number and diversity of animal facilities used by members of a German Science Foundation (DFG)-funded consortium (Priority Program 1656, "Intestinal Microbiota-A Microbial Ecosystem at the Edge between Immune Homeostasis and Inflammation") to systematically investigate the influence of mouse husbandry on variation in the intestinal microbiota. This was achieved through standardized sampling of the same C57BL/6J strain across a total of 21 animal barriers, followed by sequencing of the bacterial 16S rRNA gene using both DNA and RNA as template to analyze bacterial abundance and activity, respectively. We describe a number of known-and novel effects at multiple levels of ecological importance including taxon abundances, alpha and beta diversity, in addition to many specific associations of individual taxa to husbandry-related variables.

Results
In order to characterize and account for differences in the intestinal microbiota across the animal facilities participating in the German Science Foundation (DFG)-funded Priority Program 1656 "Intestinal Microbiota" (www.intestinal-microbiota.de), 13 participating institutions conducted representative sampling of their in-house C57BL/6J breeding stocks in 21 different animal facilities. For each facility, fresh feces from five adult (12-14 weeks of age) male C57BL/6J mice derived from independent families and cages were collected (i.e. in total 105 mice; see Methods for sampling procedures). Bacterial community profiles were generated by 16S rRNA gene amplicon sequencing (V1-V2 on Illumina MiSeq; see Section 4), based on both DNA and RNA (reverse transcribed to cDNA) as template to analyze bacterial abundance and activity, respectively. This yielded on average 14929.12 ± 54.30 SD (DNA) and 14942.72 ± 62.32 SD (RNA) sequences per sample (104 samples in the analyses, one sample excluded due to low coverage).

Housing and mouse characteristics influence phylum abundances
We identified vast variation already at this basic level of classification, with differences in the abundance of major and minor phyla across, and within the different animal facilities (Fig. 1A). As the ratio of Firmicutes/Bacteroidetes was previously associated to e.g. obesity phenotypes (Turnbaugh et al., 2006), we investigated whether this ratio correlates to body weight in our large, genetically identical cohort. Accordingly, we tested whether the values of each respective mouse correlates with the weight of animals within this unit, considering the correlation within each sampling unit (batch of 5 animals included in linear mixed model). We detected a significant quadratic relationship of Firmicutes/Bacteroidetes ratio with a minimum at a body weight of 30 g in DNA based samples (F 2,77 = 3.2721, P = 0.0433), but no relationship among RNA based samples ( Fig. 1B and C). We also checked the previously described relationship between body weight and the bacterial class Erysipelotrichia (class of Firmicutes, formerly Mollicutes) (Turnbaugh et al., 2008;Woting et al., 2014), which reveals a significant linear correlation in DNA (F 1,78 = 4.8816, P = 0.0301), which is again absent among the RNA based communities (F 1,78 = 2.5478, P = 0.1145). Together, these results imply a non-trivial relationship between animal weight and broad taxonomic composition, which is only present in the standing (DNA based) microbial communities, in congruence with previous studies.
Next, we analyzed phylum abundances with respect to numerous aspects of animal housing (see overview Table S1), which reveals several important patterns. To coarsely estimate an overall difference in husbandry conditions, we calculated the Gower distance between samples with respect to differences in mouse supplier, treatment of chow and bedding, chow provider, barrier access, presence of other strains, housing conditions, and water treatment among facilities (Podani, 1999). Mainly the phylum abundances based on RNA correlated strongly to this generalized difference in husbandry conditions as estimated by the first principle component of this matrix, capturing over 24% of differences in husbandry (linear mixed model with the first principle component of Gower distance; Firmicutes: DNA (X 1/2 )-F 1,82 = 1.4594, P = 0.2305; RNA (X 1/2 )-F 1,82 = 7.5284, P = 0.0075; Bacteroidetes: DNA-F 1,82 = 0.9034, P = 0.3447; RNA-F 1,82 = 2.7719, P = 0.0997; Proteobacteria: DNA (X 1/4 )-F 1,82 = 3.0403, P = 0.0850; RNA (X 1/4 )-F 1,82 = 8.8376, P = 0.0039). Further, each of the three most abundant phyla (Firmicutes, Bacteroidetes, Proteobacteria) displayed numerous significant individual associations to specific aspects of husbandry, i.e. barrier access, chow treatment, presence of other mice, and housing type (Table 1, Fig. S1). Thus, overall we observed great variation in taxonomic composition already on the coarse level of phylum abundances, caused by differing barrierand chow characteristics, which likely have physiological consequences.

Influence of housing conditions on alpha diversity
We next investigated the influence of husbandry on aspects of diversity within each sample, including the species distribution (Shannon H), species richness (Chao1) and phylogenetic diversity (Faith's phylogenetic diversity). In general a wide distribution of community complexity among the participating facilities is apparent, which is influenced by both husbandry and characteristics of the mice (Fig. 2Aa, Ba, Fig. S2Aa, Ba, S3Aa, Ba). To obtain a coarse estimate of the influence of husbandry conditions, we again used the first principle component of husbandry conditions based on Gower distances between samples (Podani, 1999). DNA and RNA based species numbers (and the other diversity metrics) strongly correlated to this generalized difference in husbandry conditions (linear mixed model with the first principle component of Gower distance; Chao1: DNA-F 1,82 = 9.9930, P = 0.0022; RNA-F 1,82 = 10.9797, P = 0.0014; Shannon H: DNA-F 1,82 = 6.9387, P = 0.0101; RNA-F 1,82 = 0.9435, P = 0.3342; PD: DNA-F 1,82 = 7.9257, P = 0.0061; RNA-F 1,82 = 8.5171, P = 0.0045, see Fig. 2Ab, Bb,Fig. S2Ab,Bb,S3Ab,Bb). In more detailed analyzes, chow treatment and housing condition showed a consistent influence on richness, evenness, and phylogenetic diversity, within the active and standing communities (Table 2). In particular, irradiation of animal chow appeared to strongly reduce community complexity, even com-pared to autoclaved chow (see Table 2, Fig. 2Ac, Bc,Fig. S2Ac,Bc,S3Ac,Bc). Surprisingly, non-IVC housing decreases species diversity, although this contradictory result may be explained by the other high hygiene standards present in these barriers.
We further analyzed the relationship of continuous variables (age, weight) on the diversity of microbial communities individually. The age of animals was in general positively correlated to species number, community entropy (Shannon H), and phylogenetic diversity of animals, pointing towards a process of succession of bacterial communities over time that is relatively universal among facilities (Table S2). Only community entropy of the active community (RNA) appears to be associated to body weight, with a slight decrease in diversity with increasing weight. We also analyzed the approximate number of active species within each sample by counting the potentially active operational taxonomic units The number of potentially active species is smaller than in either DNA or RNA based datasets, which implies a high proportion of allochthonous, potentially inactive species accessory to the active autochthonous core community (Fig. S4A), which is highly variable across samples (Fig. S4B). We detected strong influences of general husbandry conditions (linear mixed model with PC1 of Gower distance: F 1,82 = 10.4317, P = 0.0018), and more specifically of chow irradiation and chow provider on the number of active OTUs (Table 2, Fig. S4C, D), while neither age nor weight affected the number of active species OTUs (Table S2). In summary, alpha diversity appears to be strongly influenced by general housing and chow characteristics, which could also potentially interact.  Table 2).

Influence of housing conditions on beta diversity
Another important aspect of community characteristics is represented by difference in community composition across individuals, i.e. beta diversity. We investigated community differences based on the shared presence of microbial species (Jaccard, 1901), or their shared abundances (Bray and Curtis, 1957).
Overall we found pronounced differences between the facilities and husbandry conditions, which results in a clear clustering of samples based on their facility-of-origin, based on both DNA and RNA, and activity based analyses (for constrained ordinations see Fig. 3, for unconstrained ordinations and clustering dendrograms Fig. S6). These patterns appear to be clearest based on the Jaccard distance (presence/absence). By correlating the first Principle Component of husbandry conditions (Gower distance) to the respective PCoA ordinations, we find strong influences of housing on the community clustering. A more detailed analysis on individual husbandry characteristics also reveals consistent differences with respect to the mouse supplier, chow treatment, water treatment, housing type, the presence of other mouse strains and the accessibility of the barriers facility (Figs. S7, S8, Table S3). Overall a higher impact was observed for microbial communities profiled at the RNA level. Distance based Redundancy Analysis (Legendre and Anderson, 1999) reveals similar patterns to the indirect gradient analyses through unconstrained PCoA (above), but further enables the assessment of the amount of variation in community structure that can be explained by husbandry conditions, which ranges from 0.049 to 0.226 (adjusted R 2 , Fig. 3, Table 3) depending on data type and distance measure (Peres-Neto et al., 2006). The mouse vendors explain most variation among the microbial communities, followed by chow treatment and provider, thus enforcing the importance of the original founding community and the role of nutrition for community differentiation. We also included animal weight and age in those models which revealed qualitatively similar results, including the significant influences of animal age and weight on the community similarity (see Table S4, Fig. S9). Interestingly, age of animals is mainly correlated with community differences, while weight only appears to associate to the community distances in the stagnant DNA based communities, and not the species profiles based on RNA or activity (see Table S4, Fig. S9).
Next, in addition to community separation, we also analyzed the amount of inter-individual variation introduced by the individual aspects of animal husbandry. Overall there are several differences between DNA and RNA based community distances, which appear rather consistent across all beta diversity measures (see Table S5, Figs. S10, S11). Community variation in the DNA based spectra is influenced by almost all factors, except treatment of drinking water. Interestingly, we observed a consistent decrease in inter-individual variation among animals provided by Janvier compared to all other mouse suppliers, meaning that these mice display more homogeneous communities than those from other vendors (Table S5, Figs. S10, S11). In contrast, characteristics such as non-individuallyventilated-cage (non-IVC), the presence of other mouse strains and unrestricted access to the facilities appear to increase the interindividual variation of mice. Further, chow provider and chow treatment influence community variation dramatically (see Table  S5, Figs. S10, S11). Unsurprisingly, community variation and differentiation are influenced by a variety of factors; in particular the initial founding community (vendor), the environment and factors that alter the exchange of bacteria among animals influence the divergence between facilities.

Indicator species for housing conditions and predictive community fingerprinting
To identify the bacterial taxa contributing to the observed community differences we used indicator species analysis at the genus-level taxa. This method incorporates both taxon abundance and frequency of occurrence to associate single taxa to a given environment. Overall we could identify 60 and 99 consensus genera at the DNA and RNA level, respectively, which are significantly associated to a given facility, which is the central unit of sampling in this study (i.e. each batch of five repli-cate mice within an animal house) (Fig. 4, Table S6). Further, we also identified genus-associations to individual characteristics including mouse suppliers (26-DNA/61-RNA), water autoclaving  (Tables S7 and S8). Even some of the major intestinal genera such as Bacteroides and Prevotella are more associated to particular aspects of husbandry. Prevotella (including uncl. Prevotellaceae, DNA/RNA) appears to be highest in the absence of other mouse strains, in mice purchased from Janvier and when irradiated chow is used, whereas Bacteroides appear to be preferentially abundant in non-IVC housing (Tables S7, S8, Figs. S12, S13). Further, Mucispirillum, are mainly present in facilities without chow treatment and where mice were purchased from Charles River. This is consistent with Mucispirillum (Mucispirillum schaedleri, formerly Flexistipes spec.) being a member of the Charles River altered Schaedler flora (CRASF ® ) (Stehr et al., 2009). Desulfovibrio is a bacterium that produces hydrogen sulfide and its occurrence is associated with a variety of disorders, including colon cancer, metabolic and inflammatory bowel diseases (Attene-Ramos et al., 2006;Levine et al., 1998;Pitcher et al., 2000;Roediger et al., 1997), and appears to benefit from certain ingredients in Altromin chow and a lack of chow treatment, i.e. no autoclaving or irradiation (sampling group 21-DNA, 13-RNA, Tables S6-S8). Treponema (Spirochaetes) appears to be significantly reduced by water autoclaving, while Citrobacter occur in the presence of non- Fig. 3. Distance based Redundancy Analysis to assess the influence of husbandry conditions on community differences, calculated from shared OTU presence (Jaccard) and shared abundances (Bray-Curtis). Centroids for each significant housing factor (based on 5000 permutations) are plotted according to color code. Fig. 4. Consensus genera significantly associated with a sampling group (facility) and present with a mean abundance of at least 100 sequences per sample in the whole dataset based on DNA (A) and RNA (B).

Table 3
Results of permutative anova analyses based on distance based Redundancy Analyses (Jaccard-differential presence, Bray-Curtis-differential abundances) including explained variation of the full model and single factors.  Tables S6-S8).
Helicobacter hepaticus is commonly used in models for colitis and displays variation in pathology with respect to the underlying microbiota present (Yang et al., 2013). However, screening for Helicobacter spp. across our data set revealed its potential presence in only a single mouse. Interestingly, not only the abundance of potentially pathogenic bacteria, but also of major probiotic bacteria, e.g. Faecalibacterium, are altered by husbandry conditions like chow treatment, cage type, and even by original mouse vendor (Jackson, see Table S8, Figs. S12, S13). Similarly, we observe abundance differences among the active Clostridia of Cluster XIVa/b, which are associated to butyrate production and anti-inflammatory immune modulation (Atarashi et al., 2011;Chang et al., 2014;Furusawa et al., 2013;Segain et al., 2000). The abundance of these Clostridia is increased through the presence of other mouse strains, or being derived from the Jackson lab (see Table S8, Fig. S13).
We further made use of a machine learning algorithm (Ran-domForest (Breiman, 2001)) to perform supervised clustering and prediction of facilities and husbandry characteristics solely based on their microbial communities (consensus genera, and species level OTU abundances). We were able to classify and predict several different housing conditions, up to the level of single facilities with very high accuracy (see Table S9). The clustering success and prediction accuracy based on housing conditions were generally higher for DNA-compared to RNA-derived profiles. Clustering appears to be especially successful for bedding and chow characteristics, as well as for the presence/absence of other mouse strains and their original provider. Clustering of communities according to their facility showed a relatively low error rate, with the prediction of a sample being derived from a given facility based on a model trained on 4/5 of the genus-level abundance dataset was correct in 19/21 and 16/21 cases (DNA, RNA). Consensus genera abundances are also sufficient to predict mouse vendor and other husbandry characteristics (Table S9). Prediction of sampling groups based on species-level OTU abundances in the training set (4/5 of the samples) was correct in 20 out of 21 cases in DNA and RNA-based datasets, respectively. Furthermore, predictions of housing conditions based on the OTU abundances are more successful if based on the DNA profiling. Specifically chow provider and bedding treatment were classifiable and predictable from both data sources. In conclusion these results demonstrate the specificity and individuality of the lab mouse fecal microbiota, even under highly controlled environmental conditions and the same host genetic background, and many combined factors appear to contribute to this pattern.

Discussion
In this study we took advantage of sampling fecal microbiota over a broad range of animal facilities while maintaining a constant host genetic background. This enabled us to simultaneously evaluate many aspects of animal husbandry that are either knownor suspected to influence microbial diversity and composition, and subsequently assess their relative contribution. This revealed novel and valuable insight into the influence of e.g. commercial vendors ("legacy effects"), food/water treatment, and housing conditions, in addition to providing the opportunity to test other relevant aspects of host-microbial ecology such as the relationship between the Firmicutes/Bacteroidetes ratio and body weight and the effect of neutral ecological drift among different isolated facilities.
At the coarsest level of taxonomic resolution, we found phylumlevel taxon abundances to be influenced by mouse vendor and other barrier characteristics. As this also results in variation in the Firmicutes/Bacteroidetes ratio, we used this as an opportunity to investigate the controversial relationship of this ratio to obesity (DiBaise et al., 2012). These analyses revealed contrasting results based on abundances inferred from genomic 16S rRNA gene copy (DNA) compared to expressed 16S rRNA transcript (RNA), as well as no continuous increase in weight with increasing abundance of Firmicutes. The relationship between Erysipelotrichia, a class of the Firmicutes and weight, may be an alternative indicator of obesity (Turnbaugh et al., 2008), for which our study also offers support.
Although both short-and long-term effects of diet on the gut microbiota have been described for humans (David et al., 2014;Wu et al., 2011) and mice (Carmody et al., 2015;Parks et al., 2013;Wang et al., 2014), knowledge on the influence of chow treatment itself is lacking, other than for its influence on consumption or animal growth (Ford, 1977a,b). Autoclaving and other treatments like irradiation appear to influence the general physical properties of chow, but additionally alter nutrient content (Caulfield et al., 2008) and accessibility to the host and microbial community (Yoshida and Shinoda, 1982) and consumption (Ford, 1977a). Irradiation may have a stronger effect on the nutritional content than heat treatment (Caulfield et al., 2008), and may thus alter microbial communities through e.g. a higher peroxide content. In our survey the use of irradiated chow was restricted to only three facilities, which also differ in other husbandry characteristics. Thus, although our observations of reduced diversity in the context of irradiation are suggestive, these results also deserve experimental interrogation. Furthermore, texture and hardness are influenced by treatment and provider, which not only influences chow utilization (Ford, 1977a,b), but also appears to significantly impact microbial communities . Thus, the observed differences with respect to chow treatment and provider in our study may be due to differences in nutrient uptake, intestinal transit times, accessibility, or other processes Jumpertz et al., 2011;Kashyap et al., 2013).
The results of our survey imply that the gut microbiota of experimental animals appear to also be influenced by potential extrinsic factors such as contamination through shed skin or dust particles carried by other animals, care takers, or scientists alike (Barberán et al., 2015;Dunn et al., 2013). Specifically, we observe increased variation among mice held in open cages, in rooms where other strains are present, or in the presence of a less restrictive access policy for the sampled barrier. Further, the genus Propionibacterium being an indicator for open barriers is a clear suggestion of human skin contamination of the mouse microbiome (see Table S7). This however may have influence on the immune responses of the mice under these barrier conditions (Fujimura et al., 2014;Olszak et al., 2012) and can make valuable comparisons across barriers with slightly differing hygiene concepts harder. Specifically the antigen content may be altered by the respective food treatment, similar to bedding treatment and the community used by the vendor for initial inoculation, which leads to phenotypic/immunological consequences (Chang et al., 2012). So does early exposure to microbes and/or their antigens result in an ameliorated response to experimental colitis or asthma in mice (Olszak et al., 2012) and allergic reactions in humans (von Mutius and Vercelli, 2010).
In addition to community-level differences, we also observe a number of important differences in individual taxa among facilities. Previous studies found differences in the presence of segmented filamentous bacteria (Clostridiales-Candidatus Savagella (Thompson et al., 2012)) between mouse vendors to have substantial immunological consequences (Ivanov et al., 2009). Many other genera with the potential to alter experimental outcomes show differences among facilities and husbandry conditions in our study, such as Staphylococcus, Streptococcaceae and Enterobacteriaceae. We further identified Mucispirillum (potentially Mucispirillum schaedleri, formerly Flexistipes spec.) as being specific to Charles River-derived mice and those fed untreated chow. These bacteria burrow into the host's mucosa (Robertson et al., 2005) and are associated with gut inflammation, but are also a normal part of the mouse microbiome (El Aidy et al., 2014;Schwab et al., 2014) and the Charles River altered Schaedler flora (CRASF ® ) (Stehr et al., 2009). Other potentially probiotic bacteria that differ across facilities are the butyrate producers Faecalibacterium and Clostridia of Cluster XIVa/b. These bacteria are influenced by husbandry conditions and the original vendor and have been frequently associated to anti-inflammatory immune modulation in mice and humans (Atarashi et al., 2011;Furusawa et al., 2013;Segain et al., 2000;Sokol et al., 2008). Differences in the profile of short chain fatty acids have been reported to correlate with the abundance of those bacteria, but also in concert with the whole bacterial community (Rogers et al., 2014).
Housing conditions or vendor effects are not the only potential causes of bacterial community differences between facilities. As there is no exchange of bacteria between barriers, they can be seen as independent, isolated microbial islands of different sizes with intrinsic community dynamics. After import, enrichment of the microbiota is limited to factors within the facility, or through potential contaminants. Neutral dynamics in finite populations can become highly important if dispersal is limited, a process known as ecological drift (Hubbell, 2001;Simberloff and Wilson, 1969;Volkov et al., 2003). Stochastic abundance fluctuations may cause the loss or even dominance of single taxa, which will subsequently drive community differences independent of husbandry conditions, even if bacterial communities are similar by descent (McCafferty et al., 2013). This is also indicated in our study, as the time since mice were imported into a respective barrier strongly correlates with the distances between microbial communities (dbRDA stratified by vendor; DNA: Jaccard-F 1102 = 2.0191, P = 0.0086, adj. R 2 = 0.0098; Bray-Curtis-F 1102= 4.1908, P = 0.0032, adj. R 2 = 0.0300; RNA: Jaccard-F 1102 = 1.6842, P = 0.0004, adj. R 2 = 0.0066; Bray-Curtis-F 1102 = 2.7360, P = 0.0002, adj. R 2 = 0.0166). This effect is most prominent in differences in abundance between shared taxa. Whether this results in functional differences remains to be investigated, although there is a high likelihood (Rogers et al., 2014).
In conclusion, despite all efforts to maintain hygiene, standardize conditions and monitor the health of laboratory mice, their microbiota remains a highly variable phenotype, and our study sheds additional light on the factors inherent to mouse husbandry that can influence community composition and diversity at all taxonomic levels. The fact that each facility harbors its own unique combination of a multitude of variable factors is likely the reason for the high degree of facility-level individuality we observe. While it is essential to account-and control for such differences between facilities, the documentation of such diversity may also serve as a valuable future resource for investigating the origins of microbial-driven host phenotypes.

Animal husbandry and experimental treatments
Mice were kept under facility-specific husbandry conditions according to the respective EU (2010/63/EU), national (animal welfare law) and state regulations and sampled under standardized conditions. Fresh feces were sampled from five C57BL/6J males from the respective breeding barriers derived from 5 separate cages/families at 12-14 weeks of age (14.079 ± 4.496 SD weeks; a small number of individuals (n = 10) ranged from 21 to 28 weeks) in the morning before regular routines. Questionnaires were used to record the original mouse suppliers, when they were received, how many generations elapsed, which diet is fed and how it is treated. We further noted the cage system, frequency and supplier of bedding changes, the presence of enrichment or other mouse strains, the number of animal caretakers, access to the barrier, humidity, temperature, and animal weight.

DNA extraction and 16S rRNA gene sequencing
Fecal samples were sent on dry ice immediately after sampling and stored after arrival at −80 • C. DNA and RNA were extracted with the AllPrep DNA/RNA (Qiagen ® ) following the manufacturer's protocol. RNA was reverse transcribed to cDNA using random hexamers as previously described (Rehman et al., 2010). The 16S rRNA gene was amplified using uniquely barcoded primers flanking the V1 and V2 hypervariable regions (27F-338R) with fused MiSeq adapters in a 25 l PCR. We used 4 l of each forward and reverse primer (0.28 M), 0.5 l dNTPs (200 M each), 0.25 l Phusion Hot Start II High-Fidelity DNA Polymerase (0.5 Us), 5 l of HF buffer (Thermo Fisher Scientific, Inc., Waltham, MA, USA) and 1 l of undiluted DNA or cDNA. PCRs were conducted with the following cycling conditions (98 • C-30 s, 30 × [98 • C-9s, 55 • C-60s, 72 • C-90s], 72 • C-10 min) and checked on a 1.5% agarose gel. The concentration of the amplicons was estimated using a Gel Doc TM XR+ System coupled with Image Lab TM Software (BioRad, Hercules, CA USA) with 3 l of O'GeneRulerTM 100 bp Plus DNA Ladder (Thermo Fisher Scientific, Inc., Waltham, MA, USA) as the internal standard for band intensity measurement. The samples of individual gels were pooled into approximately equimolar subpools as indicated by band intensity and measured with the Qubit dsDNA br Assay Kit (Life Technologies GmbH, Darmstadt, Germany). Subpools were mixed in an equimolar fashion and stored at −20 • C until sequencing. Sequencing was performed on the Illumina MiSeq platform with v3 chemistry.

Sequence processing and quality control
Raw BCL files were transformed to FASTQ via CASVA-1.8.2 using both forward and reverse barcodes. Sequence pairs were filtered and merged via USEARCH 8.0.1616 (quality threshold for truncation: 5; minimum length single: 200; minimum length after merge: 300; maximum length after merge: 350; minimum overlap: 100; maximum differences in overlap: 1). Quality filtering was done using the FASTX tool with a minimum quality score of 30/33 in at least 99% of the sequence. A second round of quality filtering was performed via USEARCH (-fastq maxee 0.1). Chimeric sequences were determined using USEARCH (database informed UCHIME algorithm and the rdp gold database) (Edgar, 2010). Sequences were classified and confirmed as bacterial using the RDP classifier with ≥60% bootstrap support (1000 iterations) with the RDP10 database version as provided by P. Schloss as a training set (Wang et al., 2007). For all downstream analyses of diversity and habitat association, we took a random subset of 15000 sequences per sample to normalize the read distribution. OTU binning was performed via USEARCH with recentering based on a two-step clustering scheme. We first clustered length sorted fasta sequences at the 97% identity level, and then subsequently sorted them by their abundance, followed by another round of clustering at 97% identity. The resulting set of OTU centroid consensus sequences were then used for remapping of the original sequences. This resulted in high overall species coverage in the DNA and RNA based sequence sets (Good's Coverage-DNA: 94.80% ± 2.78% SD, RNA: 95.78% ± 3.45% SD). Phylogenetic tree construction on representative OTU sequences (consensus sequences of centroids) was carried out using FastTree 2.1 using the default CAT substitution model but gamma correction, based on the mothur v.1.31.2 produced NAST-alignment to the SILVA database (Schloss, 2010;Schloss et al., 2009). To improve accuracy of the phylogenetic trees we modified the tree search with more minimum evolution rounds for the initial tree search [-spr 4], more exhaustive tree search [-mlacc 2], and a slower initial tree search [-slownni] (Price et al., 2010). The raw anonymized sequence data can be accessed online under the accession number PRJEB12853 at the European Nucleotide Archive.

Statistical methods
Species diversity indices (species richness, Shannon-Weaver index, phylogenetic diversities) were calculated in R (Chao, 1987;Faith, 1992;Helmus et al., 2007;Kembel et al., 2010;Oksanen et al., 2011). The phylogenetic measures of beta diversity, unweighted/weighted UniFrac, were calculated in mothur . Beta diversity metrics based on differential OTU presence (Jaccard distance)-or abundance (Bray-Curtis distance) were calculated in the vegan package for R. Distances were ordinated via average distance clustering and Principal Coordinate Analysis (PCoA) and fit of clusters was assessed via an iterative process using the envfit function implemented in vegan (5000 permutations). For constrained ordination we used distance based Redundancy Analysis (dbRDA) (Legendre and Anderson, 1999). Significance of factors and axes in distance based RDA (dbRDA) was ascertained using a permutative anova approach to assess model factors and the constrained dimensions (5000 permutations). Further analyses of variation in community composition were investigated by comparing the size-corrected distances of samples to the centroid of their cluster among factor levels, via a permutative anova approach ("betadisper"). Univariate analyses were carried out with linear mixed models with the sampling unit (batch) as a random variable and a correlation of measures within each respective facility with the nlme package. Model selection was performed using the conditional AIC criterion and its weights as implemented in dredge (MuMIn package) (Bartoń, 2013), with an appropriate distribution of model residuals after refitting of the final model under REML as a requirement (Pinheiro et al., 2011). Pairwise post-hoc tests of final model parameters were performed by glht (multcomp package) via a Tukey honest significance test among all levels of each respective model factor at once. The Gower distance between samples based on housing conditions (mouse supplier, treatment of chow and bedding, chow provider, barrier access, presence of other strains, housing conditions, and water treatment) was calculated with "gowdis" in the FD package for R and its first Principle Component was extracted capturing 24.55% of variation (Laliberté and Legendre, 2010;Laliberté et al., 2014;Podani, 1999). Indicator species analysis was based on 5000 permutations using the generalized indicator value (IndVal.g) to assess the predictive value of a consensus genus for each respective host/husbandry condition, as implemented in indicspecies (De Cáceres et al., 2010). P-values of the genus-level associations were adjusted by the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995). We further performed randomForest analyses on species level OTUs and consensus genera with five runs of tenfold cross-validation (80% of data in each step) and tuning in caret, excluding single sequence occurrences in the datasets (Breiman, 2001;Kuhn, 2008;Liaw and Wiener, 2002). First, general supervised classification was performed using the combined out-of-bag errors (OOB) as a general measure of classification accuracy on the full datasets excluding singletons. In addition, we split the dataset into a training (4/5) and test set (1/5) by randomly selecting one in five samples within each sampling unit. This explicitly allowed us to evaluate the accuracy and performance of microbial community fingerprints to distinguish samples among single facilities and husbandry conditions.